Thursday 31 May 2012

Linux Kernel Process


What is a process?
A process is an instance of a computer program that is being executed. But a process is more than just the executing program code (often called the text section in Unix; a program image has several sections, such as the text, data, and BSS sections). A process also includes a set of resources such as open files, pending signals, internal kernel data, and one or more threads of execution.

Now let's look at the processes running on your system.

On Linux, you can run the ps command:
ps aux

USER             PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
reg              637   5.6 14.9  3266892 623324   ??  S     9:55AM   1:40.51 /Applications/VirtualBox.app/Contents/MacOS/../Resources/VirtualBoxVM.app/Contents




In the output above, we can see that each process has an ID, the PID (second column).

Example: creating a process

In Linux, the fork() system call creates a new process by duplicating an existing one.
The process that calls fork() is the parent, whereas the new process is the child. The parent resumes execution and the child starts execution at the same place: where the call to fork() returns. The fork() system call returns from the kernel twice: once in the parent process and again in the newborn child process.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
  pid_t pid;

  printf("The parent calls fork!\n");
  pid = fork();   /* returns twice: 0 in the child, the child's PID in the parent */

  if (pid < 0)
    printf("error in fork!\n");   /* fork failed, no child was created */
  else if (pid == 0)
    printf("The child process, process id is %d\n", getpid());
  else
    printf("The parent process, process id is %d\n", getpid());
  return 0;
}



And the result is:
The parent calls fork!
The parent process, process id is 2377
The child process, process id is 2378
 
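For a single program, it may look surprising that both the if branch and the else branch run. Why does this happen?
Because all the resources owned by the parent are duplicated and the copy is given to the child.
From the pid = fork() line onward, the child runs the same code as the parent, but fork() returns 0 in the child, so "The child process, process id is 2378" is printed. "The parent calls fork!" is not printed again, because the child starts executing at the point where fork() returns, not at the beginning of main(). (This holds when the output goes to a terminal; if stdout is redirected to a file or a pipe, text still sitting in the stdio buffer is copied into the child and can appear twice.)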



What's in the process?



The kernel stores each process in a process descriptor of type struct task_struct, which is defined in <linux/sched.h>.

With the process descriptor now dynamically created via the slab allocator, a new structure, struct thread_info, was created that lives at the bottom of the stack (for stacks that grow down).
The thread_info structure is defined on x86 in <asm/thread_info.h>.

thread_info->task points to the process descriptor (the main task structure).

Linux uses the fact that thread_info sits at the bottom of the stack to find the current process: x86 has too few registers to dedicate one to holding the address of the current process, so the address is computed from the stack pointer instead.

static inline struct thread_info *current_thread_info(void)
{
        return (struct thread_info *)
                 (current_stack_pointer & ~(THREAD_SIZE - 1));
}

#ifdef CONFIG_4KSTACKS
#define THREAD_ORDER    0
#else
#define THREAD_ORDER    1
#endif
#define THREAD_SIZE     (PAGE_SIZE << THREAD_ORDER)
 
If CONFIG_4KSTACKS is defined, THREAD_SIZE is 4 KB; otherwise THREAD_SIZE is 8 KB (with 4 KB pages).

current_stack_pointer is declared as
register unsigned long current_stack_pointer asm("esp") __used;
which means it holds the value of the esp register, the current stack pointer.
So the expression
(current_stack_pointer & ~(THREAD_SIZE - 1))
takes the current stack pointer and clears its low bits. For example, suppose the stack pointer is 0x0151ffff
and THREAD_SIZE is 8 KB (0x2000). Then ~(THREAD_SIZE - 1) = 0xffffe000, and the result is 0x0151e000,
the bottom of the stack. No matter how the stack pointer changes, as long as it stays inside this stack,
masking it with ~(THREAD_SIZE - 1) always gives the bottom address of the stack, 0x0151e000.
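
The masking step can be tried in user space. The following is only a sketch of the arithmetic, not kernel code: stack_base() and SKETCH_THREAD_SIZE are names made up for this illustration, and 32-bit addresses with an 8 KB stack are assumed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's THREAD_SIZE: 8 KB stacks assumed. */
#define SKETCH_THREAD_SIZE 8192u

/*
 * Clear the low 13 bits of a 32-bit stack pointer value. Every address
 * inside the same 8 KB aligned region maps to the start of that region,
 * which is where thread_info would live.
 */
uint32_t stack_base(uint32_t sp)
{
    return sp & ~(SKETCH_THREAD_SIZE - 1u);
}
```

Any address inside the 8 KB region that starts at 0x0151e000, for example 0x0151f234, gives 0x0151e000; the next region starts at 0x01520000.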
 
  
The Process State:

The state field of the process descriptor describes the current condition of the process. 
Each process on the system is in exactly one of five different states. This
value is represented by one of five flags:
 
 1. TASK_RUNNING (The process is runnable: it is either currently running or on a run queue waiting to run.)
 2. TASK_INTERRUPTIBLE (The process is sleeping, waiting for some condition; it also wakes up if it receives a signal.)
 3. TASK_UNINTERRUPTIBLE (The process is sleeping and waiting for a condition inside the kernel; a signal cannot wake it up.)
 4. __TASK_STOPPED (Process execution has stopped; the process is not running and is not eligible to run.)
 5. __TASK_TRACED (The process is being traced by another process, such as a debugger.)
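
These states can also be observed from user space. As a rough sketch, assuming a Linux system with /proc mounted, the third field of /proc/<pid>/stat holds a one-letter state code: R for running, S for interruptible sleep, D for uninterruptible sleep, and on reasonably recent kernels T for stopped and t for traced. The helper name proc_state_letter() is made up for this example.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the state letter of process `pid` as reported by /proc,
 * or 0 if the file cannot be read or parsed (for example when
 * /proc is not available).
 */
char proc_state_letter(pid_t pid)
{
    char path[64];
    char line[512];
    char *close_paren;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    if (fgets(line, sizeof(line), fp) == NULL) {
        fclose(fp);
        return 0;
    }
    fclose(fp);

    /* The line looks like "pid (comm) state ...". The command name may
     * itself contain spaces or parentheses, so search for the last ')'. */
    close_paren = strrchr(line, ')');
    if (close_paren == NULL || close_paren[1] != ' ' || close_paren[2] == '\0')
        return 0;
    return close_paren[2];
}
```

A process that asks for its own state with proc_state_letter(getpid()) sees R, since it is running at the moment it reads the file. This is the same letter that ps shows in its STAT column.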

Kernel code often needs to change a process's state. The preferred mechanism is using
set_task_state(task, state);

The process creation:
In a Unix system, you can use the fork() function to create a process. The new process is a child of the
current process; it has its own PID, its parent's PID as its PPID, and a number of attributes inherited from the parent.

In the traditional approach, all resources owned by the parent are duplicated and the copy is given to the child.
This approach is naive and inefficient in that it copies much data that might otherwise be shared. Worse still, if the new process were to immediately
execute a new image, all that copying would go to waste.
So Linux uses copy-on-write: when the child is created, it does not get a copy of the parent's data. Instead,
parent and child share the same pages, mapped read-only. A page is copied only when one of them writes to it
(reads never cause a copy). Delaying the copy until the first write makes process creation much cheaper.
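
Copy-on-write itself is invisible to a program; what can be observed is the guarantee it preserves, namely that a write in the child never changes the parent's data. The sketch below checks only that guarantee. The function name parent_copy_survives_child_write() and the 4096-byte buffer size are choices made for this example.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the parent's buffer is unchanged after the child has
 * overwritten its own view of it, 0 if it changed, -1 on error.
 */
int parent_copy_survives_child_write(void)
{
    size_t len = 4096;
    char *buf = malloc(len);
    int status;
    int unchanged;
    pid_t pid;

    if (buf == NULL)
        return -1;
    memset(buf, 'P', len);           /* parent's data */

    pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: this write is what makes the kernel give the child
         * a private copy of the page. */
        memset(buf, 'C', len);
        _exit(buf[0] == 'C' ? 0 : 1);
    }

    /* Parent: wait for the child, then inspect our own buffer. */
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    unchanged = (buf[0] == 'P' && buf[len - 1] == 'P');
    free(buf);
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return unchanged;
}
```

The child uses _exit() rather than exit() so that it does not flush stdio buffers inherited from the parent.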

 
 
Thread:
 
Why were threads introduced?
  1. Over time, people wanted a single process to do several things in parallel, and those
     activities need to share the same address space and the same resources.

  2. A thread is more lightweight than a process, so it is quicker and easier to create and destroy. On many systems,
     creating a thread is 10 to 100 times faster than creating a process.

  3. Threads can communicate with each other more easily than processes can, since a process
     normally cannot access another process's address space.

Threads are a popular modern programming abstraction. They provide multiple threads of
execution within the same program in a shared memory address space. They can also
share open files and other resources. Threads enable concurrent programming and, on multiple
processor systems, true parallelism.

A thread is like a process that shares its address space with other processes, so in Linux the thread structure is the same as the process structure (task_struct).
A thread is created with a clone() call like this:
 
clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0);
 
A process created with fork() corresponds to this clone() call:
 
clone(SIGCHLD, 0); 
 
 
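The thread's clone() call passes extra flags, so the thread shares these resources with its parent:
  • FILES: open file descriptors
  • FS: filesystem information
  • VM: address space
  • SIGHAND: signal handlers
A process created with fork() gets its own copies of the FS, FILES, and VM resources instead of sharing them with the parent.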
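
The effect of sharing VM and FILES can be seen from user space with POSIX threads, which glibc builds on top of clone() on Linux. The following is a minimal sketch, not the kernel's own code; the names struct shared, worker(), and thread_shares_resources() are made up for this example. The worker thread writes to a variable owned by the creating thread and opens a pipe, and the creating thread can then use both.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>

/* Data owned by the creating thread and handed to the worker. */
struct shared {
    int read_fd;   /* descriptor opened by the worker, -1 if none */
    int value;     /* plain memory written by the worker */
};

static void *worker(void *arg)
{
    struct shared *s = arg;
    int fds[2];

    s->read_fd = -1;
    if (pipe(fds) == 0) {
        if (write(fds[1], "hi", 2) == 2)
            s->read_fd = fds[0];   /* leave the read end open for the creator */
        else
            close(fds[0]);
        close(fds[1]);
    }
    s->value = 42;                 /* shared address space: the creator sees this */
    return NULL;
}

/* Returns 1 if both the memory write and the descriptor opened by the
 * worker are usable in the creating thread, 0 if not, -1 on error. */
int thread_shares_resources(void)
{
    struct shared s = { -1, 0 };
    char buf[2] = { 0, 0 };
    pthread_t tid;
    int ok;

    if (pthread_create(&tid, NULL, worker, &s) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    ok = s.value == 42 &&
         s.read_fd >= 0 &&
         read(s.read_fd, buf, 2) == 2 &&
         memcmp(buf, "hi", 2) == 0;
    if (s.read_fd >= 0)
        close(s.read_fd);
    return ok;
}
```

With a child created by fork() instead, neither would work: the write to value would land in the child's private copy of the memory, and a pipe opened by the child after the fork would exist only in the child's descriptor table. On older systems the program may need to be built with the -pthread compiler option.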



 
 
 
 
