A Senior Programmer Teaches You Python: Python Multiprocess Programming Fundamentals in Depth
Published: 2020-12-17 01:17:18 | Category: Python | Source: compiled from the web
Multiprocess programming is essential knowledge for Python programmers moving to an advanced level. We are used to driving multiple processes through the `multiprocessing` library without knowing how it actually works underneath. Below I walk briefly through the most common multiprocess techniques using the native system calls directly, to help readers understand the underlying mechanism. The code runs on Linux. If you have no Linux machine, you can try it in Docker or a virtual machine.

```
docker pull python:2.7
```

## Creating a child process

Python creates a child process with `os.fork()`. The call returns in both the parent and the child: in the parent it returns the child's pid, and in the child it returns 0. At the C level a negative return value means the fork failed, usually because the operating system is short of resources. In Python, `os.fork()` reports that failure by raising `OSError` rather than returning a negative number.

```python
import os
# ... the remainder of this listing did not survive in the source
```

## Creating multiple child processes

Calling `create_child` several times produces several child processes, provided that `create_child` only ever runs in the parent. A child must not call it again.

```python
# coding: utf-8
# child.py
import os

def create_child(i):
    pid = os.fork()
    if pid > 0:
        print 'in father process'
        return pid
    elif pid == 0:
        print 'in child process', i
        return 0
    else:
        # defensive only: os.fork() itself raises OSError on failure
        raise OSError('fork failed')

for i in range(10):  # loop 10 times to create 10 child processes
    pid = create_child(i)
    # pid == 0 means we are in the child: leave the loop at once,
    # otherwise the child would go on to fork children of its own,
    # generation after generation, and spawn far too many processes
    if pid == 0:
        break
```

Running `python child.py` prints:

```
in father process
in father process
in child process 0
in child process 1
in father process
in child process 2
in child process 3
in child process 4
in child process 5
in child process 6
in child process 7
in child process 8
in child process 9
```

The parent prints `in father process` once per fork. The listing shows only part of those lines, and the exact interleaving depends on how the processes are scheduled.

## Putting a process to sleep

`time.sleep` suspends the process for the given number of seconds. The value may be fractional.

```python
import time

for i in range(5):
    print 'hello'
    time.sleep(1)  # sleep for 1s
```

## Killing a child process

`os.kill(pid, sig_num)` sends a signal to the process whose pid is `pid`. The commonly used values of `sig_num` are SIGKILL (forced kill, equivalent to `kill -9`), SIGTERM (asks the process to exit, equivalent to `kill` without arguments) and SIGINT (equivalent to Ctrl+C on the keyboard).

```python
# coding: utf-8
# kill.py
import os
import time
import signal

def create_child():
    pid = os.fork()
    if pid > 0:
        return pid
    elif pid == 0:
        return 0
    else:
        # defensive only: os.fork() itself raises OSError on failure
        raise OSError('fork failed')

pid = create_child()
if pid == 0:
    while True:  # the child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
    time.sleep(5)  # the parent sleeps 5s, then kills the child
    os.kill(pid, signal.SIGKILL)
    time.sleep(5)  # the parent sleeps another 5s to see whether the child still prints
```

Running `python kill.py` produces the following console output:

```
in father process
in child process
# wait 1s
in child process
# wait 1s
in child process
# wait 1s
in child process
# wait 1s
in child process
# waited 5s
```

This shows that the child stops printing once `os.kill` has run.

## Zombie child processes

In the example above, if we quickly check the process state with `ps -ef|grep python` right after `os.kill` has run, the child shows up with an odd entry:

```
root 13 12 0 11:22 pts/0 00:00:00 [python] <defunct>
```

Once the parent terminates, the child disappears along with it. So what does `<defunct>` mean? It marks a zombie process: a child that has already terminated but whose exit status the parent has not yet collected. To get rid of it, the parent has to reap the child with `os.waitpid`.

```python
# coding: utf-8
# kill.py
import os
import time
import signal

def create_child():
    pid = os.fork()
    if pid > 0:
        return pid
    elif pid == 0:
        return 0
    else:
        # defensive only: os.fork() itself raises OSError on failure
        raise OSError('fork failed')

pid = create_child()
if pid == 0:
    while True:  # the child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
    time.sleep(5)  # the parent sleeps 5s, then kills the child
    # SIGKILL, to match the (125, 9) output and the discussion below
    os.kill(pid, signal.SIGKILL)
    ret = os.waitpid(pid, 0)  # reap the child
    print ret  # see what was actually returned
    time.sleep(5)  # the parent sleeps another 5s to see whether the child still exists
```

Running `python kill.py` prints:

```
in father process
in child process
in child process
in child process
in child process
in child process
in child process
(125, 9)
```

`waitpid` returns a tuple. The first element is the child's pid. What about the second element, 9? Its meaning differs between operating systems, but on Unix it is normally a 16-bit integer: the high 8 bits hold the process's exit status and the low 8 bits describe the signal that terminated the process (strictly, the low 7 bits are the signal number and the remaining bit flags a core dump). In this example the exit status is therefore 0 and the signal number is 9. Remember the `kill -9` command? That 9 is the same number, the forced kill.

If we switch `os.kill` to another signal, for example `os.kill(pid, signal.SIGTERM)`, the result becomes `(138, 15)`, and 15 is the integer value of SIGTERM.

`waitpid(pid, 0)` also serves to wait for the child to finish: if the child does not exit, the call blocks indefinitely.

## Catching signals

The default action for SIGTERM is to terminate the process, but we can install our own handler for SIGTERM so that the process does not exit.

```python
# imports and create_child() as in kill.py above
pid = create_child()
if pid == 0:
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    while True:  # the child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
    time.sleep(5)  # the parent sleeps 5s, then signals the child
    os.kill(pid, signal.SIGTERM)  # send SIGTERM
    time.sleep(5)  # the parent sleeps another 5s to see whether the child still exists
    os.kill(pid, signal.SIGKILL)  # send SIGKILL
    time.sleep(5)  # the parent sleeps another 5s to see whether the child still exists
```

We set the signal disposition inside the child, and `SIG_IGN` means the signal is ignored. After the first `os.kill` the child keeps printing, which shows it was not killed. After the second `os.kill` the child finally stops printing, because SIGKILL cannot be caught or ignored.

Next we switch to a custom handler: on receiving SIGTERM the child prints a message and then exits.

```python
# coding: utf-8
import os
import sys
import time
import signal

def create_child():
    pid = os.fork()
    if pid > 0:
        return pid
    elif pid == 0:
        return 0
    else:
        # defensive only: os.fork() itself raises OSError on failure
        raise OSError('fork failed')

def i_will_die(sig_num, frame):  # custom signal handler
    print "child will die"
    sys.exit(0)

pid = create_child()
if pid == 0:
    signal.signal(signal.SIGTERM, i_will_die)
    while True:  # the child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
    time.sleep(5)  # the parent sleeps 5s, then signals the child
    os.kill(pid, signal.SIGTERM)
    time.sleep(5)  # the parent sleeps another 5s to see whether the child still exists
```

The output is:

```
in father process
in child process
in child process
in child process
in child process
in child process
child will die
```

A signal handler takes two arguments. The first, `sig_num`, is the integer value of the signal that was caught. The second, `frame`, is harder to grasp and rarely used: it is the Python stack frame that was executing when the signal interrupted the program. Readers do not need to understand it in depth.

## A parallel computation example with multiple processes

Next we use multiple processes to compute the value of pi. There is a limit formula for pi, and we will use that formula for the computation. The series, as implemented in the code below, is:

$$\frac{\pi^2}{8} = \sum_{k=0}^{\infty} \frac{1}{(2k+1)^2}$$

```python
import math

def pi(n):
    s = 0.0
    for i in range(n):
        s += 1.0/(2*i+1)/(2*i+1)
    return math.sqrt(8 * s)

print pi(10000000)
```

Output:

```
3.14159262176
```

This program takes a little while to produce its result, but the value is already very close to pi.

Now for the multiprocess version, where we use redis for communication between the processes.

```python
# coding: utf-8
import os
import sys
import math
import redis

def slice(mink, maxk):
    s = 0.0
    for k in range(mink, maxk):
        s += 1.0/(2*k+1)/(2*k+1)
    return s

def pi(n):
    pids = []
    unit = n / 10
    client = redis.StrictRedis()
    client.delete("result")  # make sure the result list starts out clean
    del client  # close the connection
    for i in range(10):  # split the work across 10 child processes
        mink = unit * i
        maxk = mink + unit
        pid = os.fork()
        if pid > 0:
            pids.append(pid)
        else:
            s = slice(mink, maxk)  # the child starts computing
            client = redis.StrictRedis()
            client.rpush("result", str(s))  # pass the child's result back
            sys.exit(0)  # the child is done
    for pid in pids:
        os.waitpid(pid, 0)  # wait for the child to finish
    sum = 0
    client = redis.StrictRedis()
    for s in client.lrange("result", 0, -1):
        sum += float(s)  # collect the children's results
    return math.sqrt(sum * 8)

print pi(10000000)
```

We split the summation of the series across 10 child processes. Each child handles 1/10 of the work and pushes its partial result onto a redis list. The parent waits for all children to finish and then adds up everything in the list to compute the final result.

The output is:

```
3.14159262176
```

The result matches the single-process version, but the running time is considerably shorter.

We use redis for inter-process communication here because inter-process communication is a fairly involved topic that deserves an article of its own. Please bear with me until the next installment, where we will replace redis with a proper inter-process communication technique.

(Editor: 李大同)
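The zombie-process section reads the second element of the `os.waitpid` result by hand as "high byte exit status, low byte signal". As a minimal sketch of the same idea, the standard `os.WIFSIGNALED`, `os.WTERMSIG`, `os.WIFEXITED` and `os.WEXITSTATUS` helpers can do the decoding. The sketch is written for Python 3 on Linux (unlike the article's Python 2.7 listings), and the names `run_and_kill` and `describe_status` are made up for illustration.

```python
import os
import signal
import time


def describe_status(status):
    """Decode a wait status as returned in the second slot of os.waitpid()."""
    if os.WIFSIGNALED(status):
        # The child was terminated by a signal: report the signal number.
        return ("signal", os.WTERMSIG(status))
    if os.WIFEXITED(status):
        # The child exited on its own: report its exit code.
        return ("exit", os.WEXITSTATUS(status))
    return ("other", status)


def run_and_kill(sig):
    """Fork a child that only sleeps, send it `sig`, reap it, decode the status."""
    pid = os.fork()
    if pid == 0:
        # Child: idle until a signal terminates the process.
        while True:
            time.sleep(1)
    os.kill(pid, sig)
    # Blocks until the child has terminated and reaps it, so no zombie is left.
    _, status = os.waitpid(pid, 0)
    return describe_status(status)


if __name__ == "__main__":
    print(run_and_kill(signal.SIGKILL))  # on Linux: ('signal', 9)
    print(run_and_kill(signal.SIGTERM))  # on Linux: ('signal', 15)
```

Using the helpers avoids hard-coding the bit layout, which the article itself notes is not identical across operating systems.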
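The article closes by promising to replace redis with a real inter-process communication technique in a follow-up. The sketch below is one possible way to do that with `os.pipe`, not the author's own follow-up code. It is written for Python 3, the helper name `partial_sum` and the `workers` parameter are assumptions introduced here, and unlike the original listing the last worker also picks up any remainder when `n` is not divisible by the worker count.

```python
import math
import os
import struct


def partial_sum(mink, maxk):
    """Sum 1/(2k+1)^2 for k in [mink, maxk)."""
    s = 0.0
    for k in range(mink, maxk):
        s += 1.0 / (2 * k + 1) / (2 * k + 1)
    return s


def pi(n, workers=10):
    """Approximate pi with `workers` forked children reporting over pipes."""
    unit = n // workers
    children = []  # (pid, read end of that child's pipe)
    for i in range(workers):
        mink = unit * i
        # The last worker takes the remainder so that all n terms are covered.
        maxk = n if i == workers - 1 else mink + unit
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute one slice, send it as an 8-byte double, then exit.
            os.close(read_fd)
            try:
                os.write(write_fd, struct.pack("d", partial_sum(mink, maxk)))
            finally:
                os._exit(0)  # never fall back into the parent's code path
        # Parent: keep only the read end of this pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0.0
    for pid, read_fd in children:
        data = os.read(read_fd, 8)  # blocks until the child has written
        os.close(read_fd)
        os.waitpid(pid, 0)          # reap the child so no zombie remains
        total += struct.unpack("d", data)[0]
    return math.sqrt(total * 8)


if __name__ == "__main__":
    print(pi(10000000))
```

Each child gets its own pipe, so results cannot interleave and no external service is needed. The trade-off is that the parent reads the results in fork order and therefore waits on the slowest earlier child first.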