Docker networking: bridge
Recommended reading before this article (bridge basics are not covered here):
https://blog.csdn.net/u014027051/article/details/53908878/
http://williamherry.blogspot.com/2012/05/linux.html
https://tonybai.com/2016/01/15/understanding-container-networking-on-single-host/

linux bridge:
ip netns add ns0
ip netns add ns1
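As a quick sanity check (this step is not shown in the original), list the namespaces to confirm both were created:
[root@localhost home]# ip netns list
ns1
ns0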
ip link add veth0_ns0 type veth peer name veth_ns0
ip link add veth0_ns1 type veth peer name veth_ns1
ip link set dev veth0_ns0 netns ns0
ip link set dev veth0_ns1 netns ns1
After moving one end of each pair into its namespace, check the network inside each netns: ns0 and ns1 each have a new interface, veth0_ns0 and veth0_ns1 respectively.
[root@localhost home]# ip netns exec ns0 ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
12: veth0_ns0@if11: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 82:87:07:8f:59:a9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@localhost home]# ip netns exec ns1 ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
14: veth0_ns1@if13: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 9a:14:d8:63:56:45 brd ff:ff:ff:ff:ff:ff link-netnsid 0
Check the interfaces on the host. From the interface indexes you can see that veth0_ns0 (12) and veth_ns0 (11) form one veth pair, and veth0_ns1 (14) and veth_ns1 (13) form another.
[root@localhost home]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:12:5d:af brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 02:42:37:84:0b:5f brd ff:ff:ff:ff:ff:ff
11: veth_ns0@if12: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether c2:e3:ef:a8:9c:08 brd ff:ff:ff:ff:ff:ff link-netnsid 2
13: veth_ns1@if14: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether a6:77:5d:48:10:81 brd ff:ff:ff:ff:ff:ff link-netnsid 3
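Besides eyeballing the @ifN suffixes, the pairing can be verified with ethtool, which reports the peer's interface index (assuming ethtool is installed; this check is not in the original):
[root@localhost home]# ip netns exec ns0 ethtool -S veth0_ns0
NIC statistics:
     peer_ifindex: 11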
ip netns exec ns0 ip addr add 1.1.1.1/24 dev veth0_ns0
ip netns exec ns0 ip link set dev veth0_ns0 up
ip netns exec ns1 ip addr add 1.1.1.2/24 dev veth0_ns1
ip netns exec ns1 ip link set dev veth0_ns1 up
Check the NICs in ns0 (ns1 looks similar):
[root@localhost home]# ip netns exec ns0 ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
12: veth0_ns0@if11: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
    link/ether 82:87:07:8f:59:a9 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 1.1.1.1/24 scope global veth0_ns0
       valid_lft forever preferred_lft forever
Because the two namespaces are completely independent (and the host-side peers are still down), ns0 cannot ping ns1 at this point:
[root@localhost home]# ip netns exec ns0 ping 1.1.1.2
PING 1.1.1.2 (1.1.1.2) 56(84) bytes of data.
ip link add br0 type bridge
ip link set dev br0 up
ip link set dev veth_ns0 up
ip link set dev veth_ns0 master br0
ip link set dev veth_ns1 up
ip link set dev veth_ns1 master br0
Check br0; the peer veths of both ns0 and ns1 are now attached to it:
[root@localhost home]# ip a show br0
15: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether a6:77:5d:48:10:81 brd ff:ff:ff:ff:ff:ff
inet6 fe80::42e:2dff:fe70:43d7/64 scope link
valid_lft forever preferred_lft forever
[root@localhost home]# ip a show master br0
11: veth_ns0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000
link/ether c2:e3:ef:a8:9c:08 brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet6 fe80::c0e3:efff:fea8:9c08/64 scope link
valid_lft forever preferred_lft forever
13: veth_ns1@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000
link/ether a6:77:5d:48:10:81 brd ff:ff:ff:ff:ff:ff link-netnsid 3
inet6 fe80::a477:5dff:fe48:1081/64 scope link
valid_lft forever preferred_lft forever
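The attached ports can also be listed with the bridge tool from iproute2; both ports are in the forwarding state (a verification step, not in the original; output abbreviated):
[root@localhost home]# bridge link show
11: veth_ns0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2
13: veth_ns1@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2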
Now ns0 can ping ns1 successfully:
[root@localhost netns]# ip netns exec ns0 ping 1.1.1.2
PING 1.1.1.2 (1.1.1.2) 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.148 ms
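This works because br0 acts as an L2 switch, learning source MACs on each port. The learned entries can be inspected in the forwarding database (a verification step, not in the original; output abbreviated):
[root@localhost netns]# bridge fdb show br br0
82:87:07:8f:59:a9 dev veth_ns0 master br0
9a:14:d8:63:56:45 dev veth_ns1 master br0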
The current topology is shown below (the original article includes a diagram here). The host NIC ens33 has also been attached to br0 (ip link set dev ens33 master br0); list br0's members:
[root@localhost home]# ip link show master br0
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:12:5d:af brd ff:ff:ff:ff:ff:ff
11: veth_ns0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DEFAULT group default qlen 1000
link/ether c2:e3:ef:a8:9c:08 brd ff:ff:ff:ff:ff:ff link-netnsid 2
13: veth_ns1@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DEFAULT group default qlen 1000
link/ether a6:77:5d:48:10:81 brd ff:ff:ff:ff:ff:ff link-netnsid 3
You can see that ens33 is now on br0, yet pinging the gateway from ns0 still fails. The reason: once the host interface ens33 is enslaved to br0, its IP address no longer takes effect (the bridge owns the port at L2), so connectivity breaks.
[root@localhost netns]# ip netns exec ns0 ping 192.168.80.2 -I veth0_ns0
PING 192.168.80.2 (192.168.80.2) from 1.1.1.1 veth0_ns0: 56(84) bytes of data.
A simple workaround is to give ns0 and ns1 addresses in the gateway's subnet, and give br0 an address in that subnet as well:
[root@localhost netns]# ip netns exec ns0 ip addr add 192.168.80.80/24 dev veth0_ns0
[root@localhost netns]# ip netns exec ns1 ip addr add 192.168.80.81/24 dev veth0_ns1
[root@localhost netns]# ip addr add 192.168.80.82/24 dev br0
Now both ns0 and ns1 can ping the gateway, but the host itself can no longer reach the outside world, so this is not a good solution:
[root@localhost netns]# ip netns exec ns0 ping 192.168.80.2 -I veth0_ns0
PING 192.168.80.2 (192.168.80.2) from 192.168.80.80 veth0_ns0: 56(84) bytes of data.
64 bytes from 192.168.80.2: icmp_seq=1 ttl=128 time=0.236 ms
64 bytes from 192.168.80.2: icmp_seq=2 ttl=128 time=0.239 ms
[root@localhost netns]# ip netns exec ns1 ping 192.168.80.2 -I veth0_ns1
PING 192.168.80.2 (192.168.80.2) from 192.168.80.81 veth0_ns1: 56(84) bytes of data.
64 bytes from 192.168.80.2: icmp_seq=1 ttl=128 time=0.288 ms
64 bytes from 192.168.80.2: icmp_seq=2 ttl=128 time=0.280 ms

docker bridge:
On CentOS, Docker keeps its network namespaces under /var/run/docker/netns. Every container created gets a corresponding namespace file there, and entering that namespace with nsenter shows exactly the same network state as inside the container.
First, create a bridge network and start two containers, centos0 and centos1 (a sketch of the run commands follows the inspect output below):
[root@localhost home]# docker network create -d bridge --subnet 172.1.1.0/24 my_br
Inspecting my_br shows centos0 at 172.1.1.2 and centos1 at 172.1.1.3, both inside my_br's subnet:
[root@localhost home]# docker network inspect my_br
[
    {
        "Name": "my_br",
        "Id": "f830aee4b13fa17479f850ea62d570ea61bc1c7d182a88010709a7285193bb64",
        "Created": "2018-10-17T07:23:53.31341481+08:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.1.1.0/24"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "03cda0f3fdd1fc65d198adb832998e11098bcc8c1bb5a8379f9c2ee82a14be07": {
                "Name": "centos1",
                "EndpointID": "d608d888da293967949340c1d946e92a6be06d525bcec611d0f20a6188de01ff",
                "MacAddress": "02:42:ac:01:01:03",
                "IPv4Address": "172.1.1.3/24",
                "IPv6Address": ""
            },
            "c739d26d51b08a36d3402e32fbe83656a7ac1b3f611a6c228f8ec80c84423439": {
                "Name": "centos0",
                "EndpointID": "9b38292d043fba31a5d04076c9d6a333c5beac08aba68dadeb84a5e17fed4dd6",
                "MacAddress": "02:42:ac:01:01:02",
                "IPv4Address": "172.1.1.2/24",
                "IPv6Address": ""
            }
        },
        "Labels": {}
    }
]
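For reference, the two containers could have been started roughly like this (the exact image and flags are assumptions; the original does not show these commands):
docker run -itd --name centos0 --network my_br centos
docker run -itd --name centos1 --network my_br centos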
Naturally, centos0 can ping the gateway directly:
[root@localhost home]# docker exec centos0 /bin/sh -c "ping 192.168.80.2"
PING 192.168.80.2 (192.168.80.2) 56(84) bytes of data.
64 bytes from 192.168.80.2: icmp_seq=1 ttl=127 time=0.273 ms
64 bytes from 192.168.80.2: icmp_seq=2 ttl=127 time=0.642 ms
Looking at the NICs of centos0, centos1, and the host: centos0's eth0 and the host's veth00f659d are one veth pair, and centos1's eth0 and the host's veth05377ae are another.
[root@localhost home]# docker exec centos0 /bin/ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:ac:01:01:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
[root@localhost home]# ip link
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:12:5d:af brd ff:ff:ff:ff:ff:ff
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:48:2d:4c brd ff:ff:ff:ff:ff:ff
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:48:2d:4c brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 02:42:37:84:0b:5f brd ff:ff:ff:ff:ff:ff
6: br-f830aee4b13f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:af:60:4b:4e brd ff:ff:ff:ff:ff:ff
8: veth00f659d@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-f830aee4b13f state UP mode DEFAULT group default
    link/ether 0e:45:69:f8:34:57 brd ff:ff:ff:ff:ff:ff link-netnsid 0
10: veth05377ae@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-f830aee4b13f state UP mode DEFAULT group default
    link/ether aa:ae:fc:5c:dd:06 brd ff:ff:ff:ff:ff:ff link-netnsid 1
Check centos0's routes: the default gateway is 172.1.1.1, and the NIC holding that address is the bridge backing my_br.
[root@localhost home]# docker exec centos0 /bin/bash -c "ip route"
default via 172.1.1.1 dev eth0
172.1.1.0/24 dev eth0 proto kernel scope link src 172.1.1.2
[root@localhost home]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8678329d58ab        bridge              bridge              local
e8476b504e33        host                host                local
f830aee4b13f        my_br               bridge              local
96a70c1a9516        none                null                local
The host routes relevant to centos0 are below: the 172.1.1.0/24 entry routes traffic inward to centos0, and the default entry carries its traffic outward.
[root@localhost home]# ip route
default via 192.168.80.2 dev ens33 proto dhcp metric 100
172.1.1.0/24 dev br-f830aee4b13f proto kernel scope link src 172.1.1.1
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.80.0/24 dev ens33 proto kernel scope link src 192.168.80.128 metric 100
Meanwhile, the nat table contains the following entry for 172.1.1.0/24: packets whose source is 172.1.1.0/24 and whose output interface is not the bridge are MASQUERADEd, i.e. traffic from the containers is SNATed to the host NIC's address.
Chain POSTROUTING (policy ACCEPT 332 packets, 21915 bytes)
 pkts bytes target     prot opt in     out                source          destination
    0     0 MASQUERADE  all  --  *     !br-f830aee4b13f   172.1.1.0/24    0.0.0.0/0
To summarize the path of a ping from centos0 to the external gateway (192.168.80.2): the ICMP packet's destination is 192.168.80.2; having no more specific route, it takes the default route out of the container's eth0 to the default gateway, the my_br bridge (172.1.1.1); the host then forwards it to ens33 according to its routing table, and MASQUERADE rewrites the source address to ens33's address. That is the docker bridge forwarding flow.

Rebuilding the custom bridge:
The network scheme in the first part is flawed: it knocks out one of the host's NICs. Guided by the analysis of Docker's networking above, rework it as follows (first detach ens33 from br0 with ip link set dev ens33 nomaster and remove the temporary 192.168.80.x addresses):
ip link add veth0 type veth peer name veth1
ip link set dev veth0 up
ip link set dev veth1 up
ip link set dev veth1 master br0
[root@localhost home]# ip addr add 1.1.1.3/24 dev veth0
Assigning 1.1.1.3/24 to veth0 also installs the connected route "1.1.1.0/24 dev veth0 proto kernel scope link src 1.1.1.3" on the host.
ip netns exec ns0 ip route add default via 1.1.1.3 dev veth0_ns0
iptables -t nat -A POSTROUTING -s 1.1.1.0/24 ! -o br0 -j MASQUERADE
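The rule can be verified like so (a check not in the original; counters and spacing will vary):
[root@localhost home]# iptables -t nat -vnL POSTROUTING
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source        destination
    0     0 MASQUERADE  all  --  *     !br0    1.1.1.0/24    0.0.0.0/0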
This builds a network modeled on docker bridge, and ns0 can now ping the external gateway:
[root@localhost home]# ip netns exec ns0 ip route
default via 1.1.1.3 dev veth0_ns0
1.1.1.0/24 dev veth0_ns0 proto kernel scope link src 1.1.1.1
[root@localhost home]# ip netns exec ns0 ping 192.168.80.2
PING 192.168.80.2 (192.168.80.2) 56(84) bytes of data.
64 bytes from 192.168.80.2: icmp_seq=1 ttl=127 time=0.439 ms
64 bytes from 192.168.80.2: icmp_seq=2 ttl=127 time=0.533 ms
So with MASQUERADE in play, how does iptables tell apart the return packets of different pings? At this point the host's and ns0's ICMP replies share the same addresses and protocol, and ICMP has no port numbers. The answer is that ip_conntrack tracks the id field in the ICMP header to distinguish ping processes (see the "ICMP connections" reference); ip_conntrack is the foundation on which NAT is built. (Newer kernels use nf_conntrack.)
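To make this concrete, the tracked ICMP flows can be listed; each entry carries the echo id used to match replies (requires the conntrack tool from conntrack-tools; the addresses and id below are illustrative):
[root@localhost home]# conntrack -L -p icmp
icmp     1 29 src=1.1.1.1 dst=192.168.80.2 type=8 code=0 id=1234 src=192.168.80.2 dst=192.168.80.128 type=0 code=0 id=1234 mark=0 use=1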
References:
https://blog.csdn.net/sld880311/article/details/77840343
https://docs.docker.com/network/bridge/#differences-between-user-defined-bridges-and-the-default-bridge
http://success.docker.com/article/networking#dockerbridgenetworkdriver
http://vinllen.com/linux-bridgeshe-ji-yu-shi-xian/