JDK 21 Key Feature: Understanding Virtual Threads

Among the key features of Java Development Kit (JDK) 21, one noteworthy addition is virtual threads. They were developed under OpenJDK's Project Loom and finalized in JDK 21 by JEP 444.

Why Use Virtual Threads?

  • We addressed similar questions in our previous article about Kotlin coroutines. Still, it is worth briefly restating the problems that virtual threads aim to solve.

  • The Java Virtual Machine (JVM) provides a multi-threaded execution environment, in which the java.lang.Thread class offers an abstraction over operating system threads. Before Project Loom, every thread in the JVM was essentially a thin wrapper around an operating system thread. Such threads are now called "platform threads." This model is straightforward and easy to understand, but it becomes inefficient and resource-intensive when an application has to handle large numbers of concurrent tasks.

  • Virtual threads aim to alleviate these issues. They let developers keep writing simple, blocking-style code without dedicating an operating system thread to every task. With virtual threads, a program can handle a large number of concurrent tasks more efficiently while consuming fewer system resources.

    Issues with Platform Threads

  • Platform threads are costly in several respects. First, they are expensive to create. Whenever a platform thread is created, the operating system must reserve a large block of memory (typically on the order of a megabyte) for the thread's stack, which holds both native and Java call frames. The stack size is fixed up front and cannot adjust to what the thread actually needs. In addition, whenever the scheduler preempts a thread, the operating system must perform a context switch, which is comparatively expensive.

  • As one might expect, this is costly in both space and time. In practice, the large size of thread stacks strictly limits the number of threads that can be created. In a Java application, continuously instantiating new platform threads can quickly lead to an OutOfMemoryError, because the process eventually runs out of memory or hits an operating system limit on the number of threads.

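// Requires: import java.time.Duration;
// Keeps starting platform threads until the JVM can no longer create native threads
private static void outOfMemoryErrorExample() {
  for (int i = 0; i < 100000000; i++) {
    new Thread(() -> {
      try {
        Thread.sleep(Duration.ofSeconds(1L));
      } catch (InterruptedException e) {
        throw new RuntimeException(e);
      }
    }).start();
  }
}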

The exact outcome depends on the operating system and hardware, but we can easily encounter an OutOfMemoryError within seconds:

[0.949s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 4k, detached.
[0.949s][warning][os,thread] Failed to start the native thread for java.lang.Thread "Thread-4073"
Exception in thread "main" java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached

Virtual threads represent a new type of thread in Java, introduced to address certain issues and limitations inherent in the traditional thread model. Here are some reasons for adopting virtual threads:

  • Resource Efficiency: Traditional platform threads typically require a substantial amount of memory allocated for the thread stack. In contrast, virtual threads store their thread stacks in heap memory, consuming significantly less initial memory—often just a few hundred bytes rather than megabytes. This means you can create a larger number of threads without worrying about excessive memory usage.
  • Thread Management: Creating and managing virtual threads is relatively simple. New virtual threads can be spawned with factory methods, and because they are cheap there is no need to pool or reuse them.
  • Avoiding Thread Explosion: With the traditional thread model, creating a large number of threads can lead to thread explosion due to the significant memory footprint of each thread. Since virtual threads consume fewer resources, they make handling a high volume of concurrent tasks much more manageable without risking resource exhaustion.
  • Cooperative Scheduling: Virtual threads are scheduled by the JDK rather than by the operating system. The JDK scheduler does not time-slice them: a virtual thread gives up its carrier thread when it blocks or finishes. Switching between virtual threads therefore avoids an operating system context switch, which keeps the cost of running many concurrent tasks low.
  • Avoiding Blocking: When a platform thread performs a blocking operation, its operating system thread stays tied up until the operation completes. A virtual thread that blocks is instead unmounted from its carrier thread, so other virtual threads can run in the meantime, which improves the throughput of the application.
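
For contrast with the platform-thread experiment above, here is a minimal sketch of the same idea using virtual threads. The class name ManyVirtualThreads, the method runSleepingTasks, and the count of 10,000 are illustrative choices and not part of the original example. The Thread.ofVirtual() builder it relies on is covered in the next section.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyVirtualThreads {

    /** Starts `count` virtual threads that each sleep briefly; returns how many finished. */
    static int runSleepingTasks(int count) {
        AtomicInteger completed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // Each task gets its own virtual thread; no pooling is needed
            threads.add(Thread.ofVirtual().start(() -> {
                try {
                    Thread.sleep(Duration.ofMillis(100));
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        // Wait for every virtual thread to finish before reading the counter
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // 10_000 is an arbitrary illustrative count
        System.out.println("Completed tasks: " + runSleepingTasks(10_000));
    }
}
```

On a typical machine this should finish quickly, since sleeping virtual threads do not each hold an operating system thread.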

How to Create Virtual Threads

As we have mentioned, virtual threads are a new type of thread designed to overcome the resource limitations of platform threads. They are an alternative implementation of the java.lang.Thread type, storing stack frames in the heap (garbage-collected memory) instead of the traditional stack.

Consequently, the initial memory footprint of virtual threads is often minimal, typically only a few hundred bytes rather than megabytes. In fact, stack segments can be resized at any time, so there is no need to allocate large amounts of memory to accommodate every possible use case.

Creating new virtual threads is straightforward. We can utilize a new factory method on the java.lang.Thread type. First, let’s define a utility function to create a virtual thread with a given name:

public class VirtualThreadUtil {

    /**
     * Creates and returns a new, unstarted virtual thread.
     *
     * @param name The name of the virtual thread.
     * @return A new virtual thread that has not been started yet.
     */
    public static Thread createVirtualThread(String name) {
        // Thread.ofVirtual() returns a builder: set the name first, then create the thread
        return Thread.ofVirtual()
                .name(name)
                .unstarted(() -> {
                    // Place the specific logic of what the thread should do here
                    System.out.println("Virtual thread " + Thread.currentThread().getName() + " is running...");
                });
    }

    public static void main(String[] args) throws InterruptedException {
        // Create and start a virtual thread
        Thread virtualThread = createVirtualThread("Example Virtual Thread");
        virtualThread.start();
        // Virtual threads are daemon threads, so wait for it before main exits
        virtualThread.join();
    }
}

In this example, we define a method named createVirtualThread that takes a string parameter name, which is used to set the name of the virtual thread. Thread.ofVirtual() returns a builder, so we set the name with the name method first and then pass the entry point (a Runnable) to the unstarted method, which returns the Thread. Finally, we start the thread by calling start and wait for it with join, because virtual threads are daemon threads and do not keep the JVM alive on their own.
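
Besides unstarted, the builder and the Thread class offer shortcuts that create and start a virtual thread in one call. The sketch below shows two of them, Thread.ofVirtual().name(...).start(...) and Thread.startVirtualThread(...). The class and method names are made up for illustration.

```java
public class VirtualThreadShortcuts {

    /** Starts a named virtual thread and returns the name observed from inside it. */
    static String startNamed(String name) {
        String[] observed = new String[1];
        Thread thread = Thread.ofVirtual()
                .name(name)
                .start(() -> observed[0] = Thread.currentThread().getName());
        join(thread);
        return observed[0];
    }

    /** Starts an unnamed virtual thread and reports whether the task ran on a virtual thread. */
    static boolean startUnnamed() {
        boolean[] virtual = new boolean[1];
        Thread thread = Thread.startVirtualThread(
                () -> virtual[0] = Thread.currentThread().isVirtual());
        join(thread);
        return virtual[0];
    }

    // join() also makes the values written by the virtual thread visible to the caller
    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("Observed name: " + startNamed("example-virtual-thread"));
        System.out.println("Unnamed thread is virtual: " + startUnnamed());
    }
}
```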

In addition to factory methods, we can also use a java.util.concurrent.ExecutorService implementation tailored specifically to virtual threads. It is obtained from the Executors.newVirtualThreadPerTaskExecutor() factory method. The name is quite descriptive: the executor creates a new virtual thread for each task submitted to it.

Below is an example of how to use this executor:

// @SneakyThrows comes from Lombok; log(...) and sleep(...) are helper methods not shown here
@SneakyThrows
static void concurrentMorningRoutineUsingExecutors() {
  try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    var bathTime =
      executor.submit(
        () -> {
          log("I'm going to take a bath");
          sleep(Duration.ofMillis(500L));
          log("I'm done with the bath");
        });
    var boilingWater =
      executor.submit(
        () -> {
          log("I'm going to boil some water");
          sleep(Duration.ofSeconds(1L));
          log("I'm done with the water");
        });
    bathTime.get();
    boilingWater.get();
  }
}

When using ExecutorService, threads are started differently. Each call to the submit method takes a Runnable or Callable object and returns a Future, through which we can wait for the task running on the underlying virtual thread to complete.

Threads created in this manner do not have names by default, which can make debugging difficult. We can solve this issue by using the newThreadPerTaskExecutor factory method with a ThreadFactory parameter:

@SneakyThrows
static void concurrentMorningRoutineUsingExecutorsWithName() {
  final ThreadFactory factory = Thread.ofVirtual().name("routine-", 0).factory();
  try (var executor = Executors.newThreadPerTaskExecutor(factory)) {
    var bathTime =
      executor.submit(
        () -> {
          log("I'm going to take a bath");
          sleep(Duration.ofMillis(500L));
          log("I'm done with the bath");
        });
    var boilingWater =
      executor.submit(
        () -> {
          log("I'm going to boil some water");
          sleep(Duration.ofSeconds(1L));
          log("I'm done with the water");
        });
    bathTime.get();
    boilingWater.get();
  }
}
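
The two snippets above depend on Lombok and on log and sleep helpers that are not shown. As a self-contained sketch, the version below replaces them with hypothetical stand-ins (the log format is an assumption chosen to resemble the log lines analyzed later) and returns the thread names so the effect of the naming factory is visible. The class name NamedRoutine and the method runRoutine are illustrative only.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

public class NamedRoutine {

    // Hypothetical stand-in for the log helper: prefixes the message with the current thread
    static void log(String message) {
        System.out.println(Thread.currentThread() + " | " + message);
    }

    // Hypothetical stand-in for the sleep helper: hides the checked InterruptedException
    static void sleep(Duration duration) {
        try {
            Thread.sleep(duration);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Runs two tasks on named virtual threads and returns the names of those threads. */
    static List<String> runRoutine() {
        // Threads created by this factory are named routine-0, routine-1, ...
        ThreadFactory factory = Thread.ofVirtual().name("routine-", 0).factory();
        try (var executor = Executors.newThreadPerTaskExecutor(factory)) {
            Future<String> bathTime = executor.submit(() -> {
                log("I'm going to take a bath");
                sleep(Duration.ofMillis(500L));
                log("I'm done with the bath");
                return Thread.currentThread().getName();
            });
            Future<String> boilingWater = executor.submit(() -> {
                log("I'm going to boil some water");
                sleep(Duration.ofSeconds(1L));
                log("I'm done with the water");
                return Thread.currentThread().getName();
            });
            return List.of(bathTime.get(), boilingWater.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Thread names: " + runRoutine());
    }
}
```

Because the executor asks the factory for a thread at each submit call, the first task runs on routine-0 and the second on routine-1.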

How Virtual Threads Work

How do virtual threads work? The diagram below shows the relationship between virtual threads and platform threads:

To run virtual threads, the JVM maintains a pool of platform threads managed by a dedicated ForkJoinPool. By default, the number of platform threads in this pool equals the number of available processors (logical CPU cores), and the pool can grow to a maximum of 256 threads.
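
JEP 444 describes two system properties for tuning this scheduler: jdk.virtualThreadScheduler.parallelism and jdk.virtualThreadScheduler.maxPoolSize. The sketch below only reads those properties and falls back to the defaults described above. The fallback logic is an approximation of the JDK's behavior written for illustration, not the JDK's own code, and the class name is hypothetical.

```java
public class SchedulerSettings {

    /** Parallelism requested via system property, or the number of available processors. */
    static int parallelism() {
        return Integer.getInteger("jdk.virtualThreadScheduler.parallelism",
                Runtime.getRuntime().availableProcessors());
    }

    /** Maximum pool size requested via system property, or an assumed default of at least 256. */
    static int maxPoolSize() {
        return Integer.getInteger("jdk.virtualThreadScheduler.maxPoolSize",
                Math.max(parallelism(), 256));
    }

    public static void main(String[] args) {
        System.out.println("Carrier parallelism: " + parallelism());
        System.out.println("Carrier max pool size: " + maxPoolSize());
    }
}
```

The properties can be set on the command line, for example with -Djdk.virtualThreadScheduler.parallelism=2, to experiment with fewer carrier threads.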

For each virtual thread created, the JVM schedules its execution onto a platform thread, temporarily copying the stack segment of the virtual thread from the heap to the stack of the platform thread. We say that the platform thread becomes the carrier thread of the virtual thread.

The logs produced by the previous example illustrate this. Let’s analyze one of them:

[routine-1] INFO in.rcard.virtual.threads.App - VirtualThread[#23,routine-1]/runnable@ForkJoinPool-1-worker-2 | I'm going to boil some water

The interesting part is located to the left of the vertical bar (|) character.

  1. The first part identifies the executing virtual thread: VirtualThread[#23,routine-1] reports the thread identifier (the #23 part) and the thread name.
  2. Then, we can see on which carrier thread the virtual thread is executing: ForkJoinPool-1-worker-2 represents the worker-2 platform thread of the default ForkJoinPool.
  3. The first time the virtual thread blocks on a blocking operation, the carrier thread is released, and the stack segment of the virtual thread is copied back to the heap. This way, the carrier thread can execute any other eligible virtual thread.
  4. Once the blocked virtual thread completes its blocking operation, the scheduler queues it for execution again. Execution can resume on the same carrier thread or a different one.

We can easily verify that the number of carrier threads available by default equals the number of logical CPU cores. My machine has 2 physical cores and 4 logical cores. Consequently, we can write a program that creates one more virtual thread than there are logical cores:

static void viewCarrierThreadPoolSize() {
  final ThreadFactory factory = Thread.ofVirtual().name("routine-", 0).factory();
  try (var executor = Executors.newThreadPerTaskExecutor(factory)) {
    // availableProcessors() reports the number of logical cores
    IntStream.range(0, Runtime.getRuntime().availableProcessors() + 1)
        .forEach(i -> executor.submit(() -> {
          log("Hello, I'm a virtual thread number " + i);
          sleep(Duration.ofSeconds(1L));
        }));
  }
}

Expectation of Virtual Threads Execution

We expect five virtual threads to execute on four carrier threads, with at least one carrier thread being reused. Running the program, we can see that our assumption is correct:

08:44:54.849 [routine-0] INFO in.rcard.virtual.threads.App - VirtualThread[#21,routine-0]/runnable@ForkJoinPool-1-worker-1 | Hello, I'm a virtual thread number 0
08:44:54.849 [routine-1] INFO in.rcard.virtual.threads.App - VirtualThread[#23,routine-1]/runnable@ForkJoinPool-1-worker-2 | Hello, I'm a virtual thread number 1
08:44:54.849 [routine-2] INFO in.rcard.virtual.threads.App - VirtualThread[#24,routine-2]/runnable@ForkJoinPool-1-worker-3 | Hello, I'm a virtual thread number 2
08:44:54.855 [routine-4] INFO in.rcard.virtual.threads.App - VirtualThread[#26,routine-4]/runnable@ForkJoinPool-1-worker-4 | Hello, I'm a virtual thread number 4
08:44:54.849 [routine-3] INFO in.rcard.virtual.threads.App - VirtualThread[#25,routine-3]/runnable@ForkJoinPool-1-worker-4 | Hello, I'm a virtual thread number 3

There are four carrier threads, identified as ForkJoinPool-1-worker-1, ForkJoinPool-1-worker-2, ForkJoinPool-1-worker-3, and ForkJoinPool-1-worker-4, and ForkJoinPool-1-worker-4 is used twice.
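
The same observation can be made without a logging framework. The following is a rough sketch that collects the carrier names seen by a set of virtual threads. It assumes the toString() format shown in the logs above, with the carrier name after the @ character. That format is an implementation detail rather than a documented contract, and the class and method names are illustrative.

```java
import java.time.Duration;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class CarrierThreads {

    /** Extracts the text after '@' from a thread description, e.g. "ForkJoinPool-1-worker-2". */
    static String carrierOf(Thread thread) {
        String description = thread.toString();
        int at = description.indexOf('@');
        return at < 0 ? "<no carrier>" : description.substring(at + 1);
    }

    /** Runs the given number of virtual threads and returns the distinct carrier names observed. */
    static Set<String> observeCarriers(int virtualThreads) {
        Set<String> carriers = ConcurrentHashMap.newKeySet();
        ThreadFactory factory = Thread.ofVirtual().name("routine-", 0).factory();
        // close() at the end of the try block waits for all submitted tasks
        try (var executor = Executors.newThreadPerTaskExecutor(factory)) {
            for (int i = 0; i < virtualThreads; i++) {
                executor.submit(() -> {
                    // Record the carrier while this virtual thread is mounted on it
                    carriers.add(carrierOf(Thread.currentThread()));
                    try {
                        Thread.sleep(Duration.ofMillis(200));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return carriers;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        Set<String> carriers = observeCarriers(cores + 1);
        System.out.println(cores + 1 + " virtual threads ran on carriers: " + carriers);
    }
}
```

With one more virtual thread than logical cores, at least one carrier name has to be shared by two virtual threads, which matches the log output above.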
