Video Summary and Transcription
Hello, I'm grateful that you chose to watch my talk. Thank you. Last week, I had to run some unit tests while I was coding. Vitest performance can be awesome. It also has very interesting and outstanding features. But these test runners are overly complex, and it's easy to find yourself troubleshooting for hours. Vitest has almost 200 configuration options. By explaining how things work, we can answer questions about using complex third-party runners versus the built-in Node.js runner. Understanding the internals helps avoid pitfalls, troubleshoot faster, and identify opportunities for configuring things better. The CLI layer does validation and hands over to the Vitest class. It searches for test files, runs them efficiently using a pool of workers, and ensures isolation to prevent issues with dirty state. Vitest uses the Piscina library to facilitate work with multiple workers and improve performance. The workers prepare by setting the right globals, initializing snapshots, deciding the test runner file, and starting coverage collection. Each worker executes all the tests in a file and sends an RPC message when done. The decision to reuse or create a fresh worker is determined by the isolate configuration option. Consider spreading tests among more files to fully utilize the machine. Choose between process, thread, or VM as the worker type. Process is the heaviest in terms of performance cost, while thread is lighter. VM performance depends on operations that are known to be slower, like accessing globals. Process is the classic choice for stability, while thread has limitations and known issues, and VM has many reported memory-leak issues. In the benchmark, running with multiple processes was about 50% better for a real-world application, but for unit tests one process was ten times faster. Thread was slightly faster than process, and VM was slower.
With process workers and isolation, the tests lasted approximately three minutes. Without isolation, they lasted only two minutes, much faster, but with a few failures. Threads showed similar results, again with a few failures. The risk of dealing with testing issues increases without isolation. By default, tests run sequentially inside workers, but you can configure them to run in parallel using the 'concurrent' keyword. In practice, synchronous tests still ran sequentially despite specifying 'concurrent', because concurrency in Vitest is based on promises and requires asynchronous work. Concurrency also has no isolation: mocking in one test affects other tests running at the same time. Choosing a worker configuration depends on the context. In-file concurrency is best avoided, and the process worker type is recommended as the default. Isolation is a good default in integration tests but not mandatory in unit tests. Inside the worker, a TypeScript file is handled, and mocking a function can fail in surprising ways. The worker transpiles TypeScript files using Vite server, which also replaces imports and hoists the mocking. Vite introduces a new module loader in the runtime. It hoists the mock to the first line of the transformed code so that it runs before the import, and it changes imports to dynamic imports so that mocking works. The runner intercepts calls at the file import level, but it cannot intercept calls between functions in the same file. Moving function two to a different file or exporting an object solves this. With an object export, function one calls function two on the same object, and a spy can replace functions inside the object. Vite offers a range of plugins for different functionalities. You can fix import issues by configuring the customized runtime, because the Vite server handles dependency resolution and transformation. Consider the built-in Node.js test runner or Mocha for small-scale testing.
Understanding Vitest's customized runtime leads to a better testing experience.
1. Introduction to Vitest
Hello, I'm grateful that you chose to watch my talk. Thank you. Last week, I had to run some unit tests while I was coding. Vitest performance can be awesome. It also has very interesting and outstanding features. But these test runners are overly complex. It's easy to find yourself troubleshooting for hours. Vitest has almost 200 configuration options.
Hello, I'm grateful that you chose to watch my talk. Thank you.
Last week, I had to run some unit tests while I was coding. And any time I changed something in the code, the tests ran on the second screen. I had 200 tests, and they ran in 170 milliseconds. Wow. That was so blazingly fast. Like, I could blink and miss the execution. Vitest performance can be awesome.
Now, it's not only about performance. It also has very interesting and outstanding features. Like, did you know that you can write a test with the same syntax and just get alerts when you have a performance regression? Another example: you can run tests that do type checking at compile time, and there are many other unusual, beautiful, useful features. But all of these goodies come with a price. These test runners are overly complex. It's very easy to find yourself troubleshooting for hours, asking, for example, why didn't this mocking work? It's also easy to make the wrong architectural decision because of insufficient knowledge of what's happening inside. To give you an example, Vitest has almost 200 configuration options. You can spend days reading and trying to understand all of your options. For comparison, the Node.js built-in test runner has only 19.
2. Exploring Vitest Execution
By explaining how things work, we can answer questions about using complex third-party runners versus the built-in Node.js runner. Understanding the internals helps avoid pitfalls, troubleshoot faster, and identify opportunities for configuring things better. Our journey with Vitest starts when someone types the command to run tests. The CLI layer does validation and hands over to the Vitest class. It searches for test files, runs them efficiently using a pool of workers, and ensures isolation to prevent issues with dirty state. Vitest uses the Piscina library to facilitate work with multiple workers and improve performance.
This is my hope today. By explaining to you how things work, we can answer questions like: should you even use one of these complex third-party runners, or a simpler one like the built-in Node.js runner? And also, by understanding it better, I hope that you can avoid common pitfalls, troubleshoot faster, identify opportunities for configuring things better, and get more features out of it.
Ready to dive into this rabbit hole with me? Great. Just before that, I'm Emily Goldberg. I'm a backend and testing consultant. I have worked with more than 50 companies worldwide, from big names to startups in a garage, on writing simple frontend and backend tests. I try to summarize all the knowledge and practices that I learned in my GitHub repo, which overall was starred by more than 135,000 people. If you want to learn more about how to test modern backends, the advanced stuff, then be my guest at testjavascript.com.
So our journey starts when someone types vitest. The command to start running tests provides this glob pattern to find all the files that start with this prefix. The first thing that happens is that the CLI layer of Vitest is hit, and it does mostly simple validation. Then it hands over to the big Vitest class. This is kind of a God class that holds references to the major services and orchestrates this flow. For a start, it will do all sorts of casual stuff, like installing dependencies and running the global setup hooks. Then it will call a glob class that will search for all the test files that match this pattern. And the outcome of this is obviously a list of TypeScript files to execute.
Now, Vitest needs to run all of these files. Naively, it can run them all one by one, sequentially. But then if each one lasts one second, three files will last three seconds overall. And this is a waste on a multi-core machine. The other issue to be concerned about is isolation. Each test file might leave traces if they all run in the same Node.js environment. And by traces, I mean things like changed environment variables, mocks, the module cache, connection pools, and more. Then when the next test runs, it might face all of this dirty state and run into issues.
So Vitest wants to do better. And the way it works is by grabbing these files and using a pool of workers. Under the hood, it uses a library called Piscina, created by Matteo Collina and James Snell. Piscina facilitates the work with multiple workers. It allows you to choose whether a worker is a process or a thread, and it also allows reusing existing workers for better performance.
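The scheduling idea described above can be sketched in a few lines. This is a simplified model, not Vitest's or Piscina's actual code: the `TestFile` type, the file names, and the `totalRunTime` helper are all hypothetical, and each file is reduced to a fixed duration.

```typescript
// Hypothetical model: a test file is reduced to its run time in seconds.
type TestFile = { name: string; seconds: number };

// Greedy pool: each file goes to whichever worker becomes free first.
// Returns the wall-clock time until the last worker finishes.
function totalRunTime(files: TestFile[], workerCount: number): number {
  if (workerCount < 1) throw new Error("need at least one worker");
  const busyUntil: number[] = [];
  for (let i = 0; i < workerCount; i++) busyUntil.push(0);

  for (const file of files) {
    // Find the worker that finishes its current work earliest.
    let earliest = 0;
    for (let i = 1; i < busyUntil.length; i++) {
      if (busyUntil[i] < busyUntil[earliest]) earliest = i;
    }
    busyUntil[earliest] += file.seconds;
  }
  return Math.max(...busyUntil);
}

const files: TestFile[] = [
  { name: "a.test.ts", seconds: 1 },
  { name: "b.test.ts", seconds: 1 },
  { name: "c.test.ts", seconds: 1 },
];

console.log(totalRunTime(files, 1)); // 3: one by one, the naive approach
console.log(totalRunTime(files, 3)); // 1: one worker per file
console.log(totalRunTime(files, 8)); // 1: the extra five workers stay idle
```

The last line shows why the number of files caps the achievable parallelism in this model: once there are more workers than files, adding workers changes nothing.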
3. Preparing Workers and Making Decisions
The workers prepare by setting the right globals, initializing snapshots, deciding the test runner file, and starting coverage collection. Each worker executes all the tests in a file and sends an RPC message when done. The decision to reuse or create a fresh worker is determined by the isolate configuration option. Consider spreading tests among more files to fully utilize the machine. Choose between process, thread, or VM as the worker type.
Once the workers are up, before executing tests, they need to prepare. This is the preparation phase. For example, you can choose the testing environment, front-end or back-end, so Vitest needs to set the right globals. Then it initializes all the snapshot data, decides on the test runner file, starts coverage collection, and more. Then we are ready.
The pool grabs the first file and sends it to worker number one, and the second one to worker number two. Obviously, in reality we'll have more than two workers, but I'm simplifying here. Each worker now executes all the tests in its file. We'll delve deeper into this part soon. When they're done, they send back an RPC message: we're done. Now we need to process the third file, file number three.
So here we have a decision to make. Should we reuse this worker or create a fresh one? This is determined by a configuration option. In Vitest, it's called isolate. If you set isolate to true, the pool will create worker number three, a fresh new one, which will handle the next file, file number three. Alternatively, if you specify isolate false, it will reuse worker number one. It will skip the time it takes to instantiate a new worker and just reuse it. There's one thing we can learn from this model. If you have very few test files, like two or three, which is quite common in microservices, and you got yourself a shiny new M4 Apple Silicon, then it's like driving a Ferrari in a lane with heavy traffic. You're utilizing a very, very small portion of your machine. In other words, the number of files is the upper limit of the parallelization that Vitest and Jest can do for you. So it's something to consider: spread your tests among more files. The next decision to make is which type of worker you need: a process, a thread, or a VM?
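Before moving on to worker types, the reuse-or-fresh decision above can be illustrated with a toy model. Everything here is hypothetical: `FakeWorker` and its `env` object stand in for a real worker and its process-wide state, and the two "files" are plain functions. It is a sketch of the trade-off, not of how Vitest implements isolate.

```typescript
// Hypothetical stand-in for a worker; `env` plays the role of shared state
// such as environment variables or the module cache.
class FakeWorker {
  env: Record<string, string> = {};

  run(testFile: (env: Record<string, string>) => string): string {
    return testFile(this.env);
  }
}

// File one leaves a trace behind.
const fileOne = (env: Record<string, string>): string => {
  env["FEATURE_FLAG"] = "on";
  return "pass";
};

// File three silently depends on starting from a clean environment.
const fileThree = (env: Record<string, string>): string =>
  env["FEATURE_FLAG"] === undefined ? "pass" : "fail (dirty state)";

function runFileOneThenThree(isolate: boolean): string {
  const workerOne = new FakeWorker();
  workerOne.run(fileOne);
  // isolate true: create a fresh worker. isolate false: reuse worker one.
  const next = isolate ? new FakeWorker() : workerOne;
  return next.run(fileThree);
}

console.log(runFileOneThenThree(true));  // pass
console.log(runFileOneThenThree(false)); // fail (dirty state)
```

Reusing the worker skips the cost of creating a new one, but the second file inherits whatever the first one left behind.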
Now, you know what a process and a thread are, but what is a VM? Anytime you're using the Chrome browser, with the V8 engine, of course, and you open new tabs, each tab gets its own sandboxed environment to separate the memory and storage. This feature is called a V8 context. Node.js wraps this in a module called vm that allows you to run a JavaScript file in a fully isolated, sandboxed context. What Vitest and Jest allow is running each test file in the same V8 environment, with no need to create a new worker, but in a different context. This way we achieve full isolation. So eventually, we have three options to choose from.
4. Choosing the Right Worker Type
In terms of isolation, you can choose between using a process, a thread, or a VM. Process is the heaviest in terms of performance cost, while thread is lighter. VM performance depends on operations that are known to be slower, like accessing globals. Process is the classic choice for stability, while thread has limitations and known issues, and VM has many reported memory-leak issues. In the benchmark, running with multiple processes was about 50% better for a real-world application, but for unit tests one process was ten times faster. Thread was slightly faster than process, and VM was slower. With process workers and isolation, the run lasted approximately three minutes.
Let's make a thoughtful decision. Should you use a process, a thread, or a VM? In terms of isolation, when using a process, if you want to isolate your test files, you have to create a new process every time. Same thing with a thread, but with a VM, you don't have to. You can reuse the same worker and just create a new V8 context. In terms of performance cost, process is the heaviest. You'll pay much more for instantiating a new one all the time. Thread is a little lighter. With VM, it depends. It depends on things that are known to be slower when using VM, like accessing globals. What about stability? How safe is each option? Well, process is the classic choice for everyone because this is how your production code works. You have a fresh new process that holds your framework. Thread, though, has more limitations and known issues. Just one example: there are well-known libraries in our industry, like Prisma and bcrypt, that assume they live alone in a process. Once you're running multiple threads, there are known issues and limitations related to this. Also, VM has a lot of reported issues about memory leaks. Search the Jest issues, and you'll find at least dozens of open issues about memory leaks when using VM. So, eventually, performance is a big factor in this decision, and you can't make a thoughtful decision without some benchmarks. So, I created some for you, using a real-world application, intentionally built with NestJS, which is a heavier framework: 400 code files and 300 integration tests with a database. And these were my findings. The first question that I tried to answer was, does parallelization bring any value? It wasn't easy, by the way, because when I ran it with a playground application, I didn't see any noticeable benefit from running things in parallel.
But when I ran this with real-world systems, what I noticed is that running with multiple processes was like 50% better, which makes sense, given that you have all of these moving parts. However, with unit tests, what I observed is that one process was ten times faster than multiple, because obviously the time it takes to instantiate a new worker is by itself longer than executing all the tests. The next benchmark is about which worker type is fastest. When I ran the 300 tests with the process type, it lasted approximately three minutes. Thread was only ten seconds faster, and VM was actually slower. It is documented in many places that certain things in VM are actually slower, although it was intended to be faster. The next benchmark is about the price of isolation. How much longer does it take to isolate between test files? With process worker types and with isolation, it lasted about three minutes.
5. Understanding Test Isolation and Concurrency
Without isolation, the tests lasted only two minutes, much faster, but with a few failures. Threads showed similar results, again with a few failures. The risk of dealing with testing issues increases without isolation. By default, tests run sequentially inside workers, but you can configure them to run in parallel using the 'concurrent' keyword. In practice, synchronous tests still ran sequentially despite specifying 'concurrent', because concurrency in Vitest is based on promises and requires asynchronous work. Concurrency also has no isolation: mocking in one test affects other tests running at the same time.
Without isolation, though, it lasted only two minutes, much faster. But there were a few failures. With threads, the numbers were the same, and again there were a few failures. I could probably fix the failures, I guess, but as you can see, there is more risk and a higher chance that you will have to deal with testing issues.
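To put the benchmark figures side by side, here is the arithmetic on the durations quoted in the talk. The talk gives them only approximately ("about three minutes", "ten seconds faster", "two minutes"), so the constants below are rounded assumptions and the percentages are indicative at best.

```typescript
// Approximate durations from the talk, in seconds (rounded assumptions).
const processIsolatedSeconds = 180; // about three minutes, process workers, isolated
const threadIsolatedSeconds = 170;  // "only ten seconds faster" than process
const processSharedSeconds = 120;   // about two minutes without isolation

// Share of the baseline time that the candidate saves, to one decimal place.
function percentSaved(baseline: number, candidate: number): number {
  return Math.round(((baseline - candidate) / baseline) * 1000) / 10;
}

console.log(percentSaved(processIsolatedSeconds, threadIsolatedSeconds)); // 5.6
console.log(percentSaved(processIsolatedSeconds, processSharedSeconds));  // 33.3
```

On these numbers, switching to threads saves a few percent, while dropping isolation saves about a third of the run time at the cost of the failures mentioned above.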
To get the full picture, you also need to understand the following. By default, inside the workers, the tests run sequentially, like this. But you can configure this. If you specify the concurrent keyword here, then the tests will run, going by the documentation, read it with me: use concurrent to run them in parallel.
So, I checked this. As you can see, we have multiple tests here. Each one writes its number to the log. And I tried this with many more tests. When I ran them, I expected to see the log entries in some random order. But in fact, what I got is that the tests actually ran in sequence. So, I looked into the source code of Vitest, and I saw this. That's the main function that runs tests. And then, as you can see, the run test function is wrapped by the limitMaxConcurrency function, which limits how many tests run concurrently.
And what we learn from this is that this concurrency is based on promises. It's like running the tests in Promise.all. So, it's based on Node.js concurrency, which demands that you're doing some async work. With unit tests, you usually aren't. So, if you're running unit tests, they will actually run sequentially. And not only this: concurrency also has no isolation at all. When you're running multiple tests in the same memory space, if you're doing mocking in one test, it will apply to and affect other tests running at the same time.
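The behavior described above is easy to reproduce without any test runner. The sketch below is a simplified stand-in for promise-based concurrency, not Vitest's implementation: `runConcurrently`, `makeSyncTest`, `makeAsyncTest`, and `tick` are hypothetical helpers.

```typescript
type TestFn = (log: string[]) => Promise<void>;

// Start every "test" and wait for all of them, in the spirit of Promise.all.
async function runConcurrently(tests: TestFn[]): Promise<string[]> {
  const log: string[] = [];
  await Promise.all(tests.map((test) => test(log)));
  return log;
}

// Yields to the event loop, the way real async I/O would.
const tick = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

// A typical unit test: no await between start and end, so nothing interleaves.
function makeSyncTest(n: number): TestFn {
  return async (log) => {
    log.push(`start ${n}`);
    log.push(`end ${n}`);
  };
}

// A test that does async work: other tests can start while it waits.
function makeAsyncTest(n: number): TestFn {
  return async (log) => {
    log.push(`start ${n}`);
    await tick();
    log.push(`end ${n}`);
  };
}

runConcurrently([1, 2, 3].map(makeSyncTest)).then((log) =>
  console.log(log.join(", ")),
); // start 1, end 1, start 2, end 2, start 3, end 3

runConcurrently([1, 2, 3].map(makeAsyncTest)).then((log) =>
  console.log(log.join(", ")),
); // start 1, start 2, start 3, end 1, end 2, end 3
```

Both runs share one `log` array per call, which also hints at the isolation problem: anything one test mutates is visible to every other test running in the same memory space.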
6. Understanding Worker Model and Code Failures
Concurrency in Vitest has no isolation and can affect other tests running at the same time. Choosing a worker configuration depends on the context. In-file concurrency is best avoided, and the process worker type is recommended as the default. Isolation is a good default in integration tests but not mandatory in unit tests. Inside the worker, a TypeScript file is handled, and failures can occur when mocking functions.
Now you understand the worker model. Which configuration should you choose? It depends. But I'll try to suggest a few rules of thumb. As for in-file concurrency, I think you want to avoid it by default for all the reasons that were mentioned, like no isolation. What about worker types? Which one should be the default? I suggest process, because it's the safest option and the performance penalty is not that high. What about isolation? Well, in integration tests, there are so many moving parts that not starting from a clean slate has a high chance of causing related issues. So, I suggest isolation is a good default there. But in unit tests, which run so fast, the relative performance penalty of spinning up a new process or thread is higher. And it's actually much easier to ensure a stateless environment in unit tests. So, I think that isolation is not mandatory in unit tests.
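These rules of thumb could be written down as configuration roughly as follows. This is a sketch under assumptions: the `pool` and `isolate` option names follow Vitest's documented configuration as I understand it, but names and defaults vary between versions, so check the documentation for the version you have installed. The objects are plain literals here so the snippet stands alone; in a real project each would be the default export of a `vitest.config.ts`.

```typescript
// Assumed option names: `test.pool` and `test.isolate` (verify for your version).

// Integration tests: process workers, fresh state for every file.
const integrationConfig = {
  test: {
    pool: "forks",  // process-based workers
    isolate: true,  // do not reuse a worker's state across files
  },
};

// Unit tests: still processes, but reuse workers to skip the startup cost.
const unitConfig = {
  test: {
    pool: "forks",
    isolate: false, // only safe if the tests keep no shared state
  },
};

console.log(integrationConfig.test.isolate, unitConfig.test.isolate);
```

Neither sketch turns on in-file concurrency, in line with the suggestion to avoid it by default.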
What we have seen thus far is how Vitest manages all of its workers. But what's happening inside the worker? This is no less interesting. What's happening is that a worker gets a TypeScript file to handle. And before I show you how it handles it, I want to show you a code failure. Then, by understanding the internals, we will be able to explain this failure and fix it. What we have here is very simple code to test. Function1 calls function2, which returns the string real function2. Easy and simple. Now, the test starts with mocking. We try to mock this code and make function2 return a different string, mock function2. So once we call function1, it calls function2 under the hood and should return not the real string, but rather mock function2. This should work. This looks perfectly fine. However, in reality, it fails.
7. Understanding Worker Handling and Transpilation
In the example, the mock doesn't take effect and the real function runs. The worker handles TypeScript files and transpiles them using Vite server. Vite server also replaces imports and hoists the mocking. The transformed code replaces import statements with Vite import functions.
We get the wrong output, real function2 instead of the mock. In other words, mocking here didn't work. But why? Let's try to understand. So, the worker is about to handle this TypeScript file. The first thing it does is look through all the files for mocks and note down every mock that it finds statically in the files. It creates a mocker class that holds a registry, just a dictionary. And any time it finds a mock, like in our case, it just notes: I found a mock in test file one that applies to code one, and these are the substitutes. Remember this registry because it will come back to our story very soon.
Now, since it wants to execute all of these files, and it can't because they're TypeScript, it needs to transpile them first, right? So, I thought that it would call some kind of TSC or SWC, some kind of compiler. But what I found in the code was a little different. The function transform request is responsible for this. And I noticed that it calls something called server, which is a third-party npm dependency, the Vite server. If you are not familiar with it, Vite server comes from the front-end world. It's a tool that specializes in bundling, transpiling, and transforming code. And it makes sense to use it here because Vitest is a sister project of Vite, and what we need to do is transpile. Vite server will actually do three things, and two of them are interesting. The first one is transpiling, obviously, but it will also replace the imports and hoist the mocking.
To see the transformation, we can use a very nice tool that Vitest packs, the UI tool that shows you the whole module graph. It's a valuable way to benchmark and analyze how it handles your files. Once you click on a source file, you'll get a view that shows you the source on the left and the transformed code on the right. Now, I need your focus because very interesting and unusual things are happening here. The first thing you might notice is that we have import statements on the left. They are replaced with Vite import functions. Now, this is a bold move because import is part of JavaScript. It's part of the runtime.
8. Vite's Customized Runtimes and Module Loader
Vite introduces a new module loader in the runtime.
Vite customized the runtime and introduced a new module loader, which is something to be aware of. Why does it do so? We will understand very soon.
9. Vite's Mocking Transformations and Test Failures
Vite hoists the mock to the first line of the transformed code so that it runs before the import. Additionally, Vite changes imports to dynamic imports, ensuring that mocking works. The runner intercepts calls at the file import level, but cannot intercept calls between functions in the same file. Moving function two to a different file or using an object export can solve this issue.
This is not the only transformation. See here: we tried to mock function number two and change it before our code imported and used the real one, but the import had already happened, so it's too late. For this reason, as you can see, although the mock is on line number eight in the source code, Vite hoisted it and put it on the first line of the transformed code so that it runs before the import.
However, this is not enough, because import statements always happen first. They are statically analyzed before the runtime. So what Vite does in addition is that any time it notes a mock, it changes the import to a dynamic import. See here, the await. Now, we have the mock first and then, sequentially, the import of the code. And now mocking should work. This is how Vite transforms the file and returns JavaScript back to the worker to process. Now the worker can run the file.
So it's about to run test file one, which calls code one. But because this is done using the Vite import, it can look through the mock registry and note: hey, we have a mock for this file. So let's use our mock and not the real file. So it intercepts the import and makes the mock substitute the real code. Now, back to the failing test. Look at it. You should have a clue now about why it failed. To make it super clear, what Vite is capable of doing is the following. We have a file that calls another file. When function one in code one calls a function in a different file, the call can be intercepted because it works on the import level, the file import level. Both Jest and Vitest work this way. But what they are not capable of doing is intercepting when both functions are in the same file. If function one calls function two, they don't have a mechanism to intercept this call. It happens in memory. It doesn't go through the import system. And in that case, mocking will fail. This is why our test failed, why our mocking failed. How can we fix this? First, with this understanding, you can fix all sorts of mocking issues. Here, for example, we can indeed move function two to a different file. Another solution would be to make it all part of the same object: export an object instead of named exports.
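The difference between the two layouts can be modeled with a toy module loader. This is a sketch of the idea only: `customImport`, `mockRegistry`, `realModules`, and the module paths are hypothetical names, and the real transformation and loader are considerably more involved.

```typescript
type Exports = Record<string, () => string>;

// Hypothetical mock registry: module path -> substitute exports.
const mockRegistry: Record<string, Exports> = {};

// "Real" modules, modeled as factories that evaluate the file on import.
const realModules: Record<string, () => Exports> = {
  // Two-file layout: function1 reaches function2 through the import system.
  "./split/code2": () => ({ function2: () => "real function2" }),
  "./split/code1": () => ({
    function1: () => customImport("./split/code2").function2(),
  }),
  // Single-file layout: function1 calls function2 directly, in memory.
  "./single/code1": () => {
    const function2 = () => "real function2";
    const function1 = () => function2();
    return { function1, function2 };
  },
};

// Stand-in for the runner's import function: consult the registry first.
function customImport(path: string): Exports {
  return mockRegistry[path] ?? realModules[path]();
}

// Mock function2 in both layouts before anything is imported.
mockRegistry["./split/code2"] = { function2: () => "mock function2" };
mockRegistry["./single/code1"] = {
  ...realModules["./single/code1"](), // keep the real function1
  function2: () => "mock function2",  // replace only the exported function2
};

console.log(customImport("./split/code1").function1());  // mock function2
console.log(customImport("./single/code1").function1()); // real function2
```

In the single-file layout the exported `function2` is replaced, but `function1` still holds a direct reference to the original, so the substitution never reaches it.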
10. Vite's Customized Runtime and Import Issues
With an object export, function one calls function two on the same object. Use a spy to replace functions inside the object. Vite offers a range of plugins for different functionalities. You can fix import issues by configuring the customized runtime. The Vite server handles dependency resolution and transformation. Consider using the built-in Node.js test runner or Mocha for small-scale testing. Understanding the customized runtime leads to a better testing experience.
So as you can see here, function one now calls function two on the same object. And then we need to change the test file and use a spy. A spy is a different mechanism than a mock. It doesn't work on the import level. Rather, it assumes that the object already lives in memory. And when the object is in memory, it can just replace functions inside it. And then this will just work.
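A minimal sketch of the object-export fix follows. The `replaceMethod` helper is a hand-rolled, hypothetical stand-in for a spy so that the snippet runs without a test runner; in a Vitest test you would normally reach for `vi.spyOn` instead. The object name `code1` is also an assumption.

```typescript
// Code under test, exposed as one object instead of separate named exports.
const code1 = {
  function1(): string {
    // The call goes through the object, so a replaced function2 is picked up.
    return this.function2();
  },
  function2(): string {
    return "real function2";
  },
};

// Hand-rolled stand-in for a spy: swaps a method on a live object and
// returns a function that restores the original.
function replaceMethod<T, K extends keyof T>(
  target: T,
  key: K,
  replacement: T[K],
): () => void {
  const original = target[key];
  target[key] = replacement;
  return () => {
    target[key] = original;
  };
}

const restore = replaceMethod(code1, "function2", () => "mock function2");
console.log(code1.function1()); // mock function2

restore();
console.log(code1.function1()); // real function2
```

No import interception is involved here: the replacement works because `function1` looks up `function2` on the object at call time.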
Another lesson to learn from this diagram is that under the hood, we have Vite. And we can use it to our benefit. It has a huge ecosystem of plugins. So for example, if you want a fast Node.js server with auto reload, you got it. There is an existing plugin for this. You want ESLint checks on the fly? Got it. And there are hundreds of other plugins, most of them front-end, to be honest. And now you can fix all sorts of import issues.
For example, if you try to import function two, it's quite common to get all sorts of failures, like a member doesn't exist, undefined, and so on. And then you ask yourself, which rules should I follow here? ECMAScript modules or CommonJS? And the answer is neither. It's a custom runtime, a custom module system that Vitest created. But now, at least you know to go to the configuration, and you understand that the server entry is where the Vite server is configured. It handles all the dependency resolution and transformation. And now you can look at all the various configuration options and change things. If you want to avoid transforming some libraries, yours or third-party, you can just exclude them here. If you want to gain much more control over how it treats various ESM or CJS modules, there are other entries for this. But now you know where to look and where to analyze and tune this system.
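As a rough illustration of where those entries live, the sketch below shows the shape of such a configuration. It assumes the `test.server.deps.external` and `test.server.deps.inline` options from Vitest's documented configuration as I understand it; option names differ between versions, and both package names are made up. The object is a plain literal so the snippet stands alone; in a project it would be the default export of `vitest.config.ts`.

```typescript
// Assumed option names (verify against your installed Vitest version).
// Package names are hypothetical placeholders.
const vitestConfig = {
  test: {
    server: {
      deps: {
        // Leave these to the native Node.js loader instead of transforming them.
        external: [/my-prebuilt-lib/],
        // Force these through the Vite pipeline, for example a package
        // whose module format trips up the native loader.
        inline: ["some-esm-only-package"],
      },
    },
  },
};

console.log(Object.keys(vitestConfig.test.server.deps)); // external, inline
```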
However, I would positively consider the built-in Node.js test runner or Mocha if you are just testing a library, or have a small microservice with not so many tests, or are just running a few unit tests, because in these cases you pay the price but get much less value. I hope that by now you understand how it works and the various considerations that apply, so you can have a better testing experience. Thank you very much.