Virtualize Raspberry Pi 3s to run a Docker Swarm cluster on them

If you have ever wanted to build a Docker Swarm cluster on Raspberry Pis, the first step is usually to buy a stack of RPi hardware, and the cluster can only scale as far as your hardware budget allows. That is not an affordable route for everyone who just wants an RPi cluster for testing or playing around.
By virtualizing a 64-bit RPi 3 you abstract the Raspberry hardware away from the operating system layer, and you can run almost the same applications you would run on a physical Raspberry Pi 3. QEMU's AArch64 emulator can simulate the Cortex-A53 CPU used by the RPi 3.

Requirements:

  1. A pre-installed QEMU engine (under Linux, or as a VMDK/QCOW image) that is capable of running ARM guests. In this use case we simulate a Raspberry Pi 3, which requires qemu-system-aarch64 (64-bit) rather than the 32-bit qemu-system-arm used previously. A guide to installing QEMU with aarch64/arm support:
    http://modernhackers.com/build-a-virtual-raspberry-pi-cluster/
  2. Debian Buster image for aarch64 architecture: https://people.debian.org/~stapelberg/raspberrypi3/2018-01-08/2018-01-08-raspberry-pi-3-buster-PREVIEW.img.xz
  3. VMware Workstation to pre-stage the image for future GNS3 use
  4. The unxz decompression tool, which is part of the xz-utils package (apt-get install xz-utils)
  5. GNS3 Server where you will deploy the image

Download the aarch64 debian image

wget https://people.debian.org/~stapelberg/raspberrypi3/2018-01-08/2018-01-08-raspberry-pi-3-buster-PREVIEW.img.xz

Extract

unxz 2018-01-08-raspberry-pi-3-buster-PREVIEW.img.xz

Preparation for the image mount

fdisk -l 2018-01-08-raspberry-pi-3-buster-PREVIEW.img
Disk 2018-01-08-raspberry-pi-3-buster-PREVIEW.img: 1.1 GiB, 1153433600 bytes, 2252800 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xac8dad98

Device                                        Boot  Start     End Sectors  Size Id Type
2018-01-08-raspberry-pi-3-buster-PREVIEW.img1        2048  614399  612352  299M  c W95 FAT32 (LBA)
2018-01-08-raspberry-pi-3-buster-PREVIEW.img2      614400 2252799 1638400  800M 83 Linux

You can see that the root filesystem (.img2) starts at sector 614400. Multiply that value by the 512-byte sector size to get the byte offset for mount: 512 * 614400 = 314572800 bytes.

sudo mkdir -p /mnt/debian
sudo mount -v -o offset=314572800 -t ext4 ./2018-01-08-raspberry-pi-3-buster-PREVIEW.img /mnt/debian
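
The offset arithmetic above can be sketched as a small shell helper. The function name partition_offset is hypothetical and chosen only for illustration; the start sectors come from the fdisk listing, and the 512-byte default is the sector size reported on its "Units" line.

```shell
#!/bin/sh
# Hypothetical helper: convert a partition's start sector into the byte
# offset expected by `mount -o offset=...`.
# $1 = start sector, $2 = sector size in bytes (defaults to 512).
partition_offset() {
    start_sector=$1
    sector_size=${2:-512}
    echo $(( start_sector * sector_size ))
}

partition_offset 614400   # root partition (.img2), prints 314572800
partition_offset 2048     # FAT boot partition (.img1), prints 1048576
```

If your image reports a different start sector or sector size, pass those values instead.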

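Edit the fstab in the mounted system so that the root file system points to /dev/vda2 (the virtio disk that QEMU will present) instead of the SD card device, and the /boot/firmware entry is commented out:

sudo vi /mnt/debian/etc/fstab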
# The root file system has fs_passno=1 as per fstab(5) for automatic fsck.
#/dev/mmcblk0p2 / ext4 rw 0 1
/dev/vda2 / ext4 rw 0 1
# All other file systems have fs_passno=2 as per fstab(5) for automatic fsck.
#/dev/mmcblk0p1 /boot/firmware vfat rw 0 2
proc /proc proc defaults 0 0
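
If you prefer not to edit the file by hand, the same change can be sketched with sed. This is only a sketch: it assumes the untouched image's fstab contains active /dev/mmcblk0p2 and /dev/mmcblk0p1 lines, it rewrites the root line in place rather than keeping a commented copy, and the function name patch_fstab is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: apply the fstab change non-interactively (GNU sed).
# $1 = path to the fstab file, e.g. /mnt/debian/etc/fstab (run as root there).
patch_fstab() {
    # 1st expression: point the root file system at the virtio disk.
    # 2nd expression: comment out the SD card boot partition entry.
    sed -i \
        -e 's|^/dev/mmcblk0p2[[:space:]]|/dev/vda2 |' \
        -e 's|^/dev/mmcblk0p1[[:space:]]|#&|' \
        "$1"
}

# Example:
# patch_fstab /mnt/debian/etc/fstab
```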

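Unmount the root partition (sudo umount /mnt/debian), then mount the boot partition, which starts at sector 2048 (2048 * 512 = 1048576 bytes). We need to copy out the kernel and initrd; otherwise QEMU is not able to boot this image: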

sudo mount -v -o offset=1048576 -t vfat ./2018-01-08-raspberry-pi-3-buster-PREVIEW.img /mnt/debian
mount: /dev/loop0 mounted on /mnt/debian.
cp /mnt/debian/vmlinuz-4.14.0-3-arm64 ./
cp /mnt/debian/initrd.img-4.14.0-3-arm64 ./
cp /mnt/debian/bcm2837-rpi-3-b.dtb ./
cp /mnt/debian/cmdline.txt ./
sudo umount /mnt/debian/
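
The QEMU command in the next step reads a disk named image.qcow in qcow2 format, while the downloaded file is a raw .img. The article does not show how image.qcow was produced. One plausible way, assuming qemu-img is installed alongside QEMU, is to convert the raw image; the function name to_qcow2 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a raw disk image to qcow2 with qemu-img.
# $1 = raw source image, $2 = qcow2 destination file.
to_qcow2() {
    qemu-img convert -f raw -O qcow2 "$1" "$2"
}

# Example (run after the fstab edit, with the image unmounted):
# to_qcow2 2018-01-08-raspberry-pi-3-buster-PREVIEW.img image.qcow
```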

Pilot run your Virtual Raspberry Pi3 Image

sudo qemu-system-aarch64 \
  -kernel vmlinuz-4.14.0-3-arm64 \
  -initrd initrd.img-4.14.0-3-arm64 \
  -m 1024 -M virt \
  -cpu cortex-a53 \
  -append "rw root=/dev/vda2 console=ttyAMA0 loglevel=8 rootwait fsck.repair=yes memtest=1" \
  -drive file=image.qcow,format=qcow2,if=sd,id=hd-root \
  -device virtio-blk-device,drive=hd-root \
  -net nic \
  -net tap,ifname=tap0 \
  -nographic \
  -no-reboot

You should get the following output for the first run:

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.14.0-3-arm64 (debian-kernel@lists.debian.org) (gcc version 7.2.0 (Debian 7.2.0-18)) #1 SMP Debian 4.14.12-2 (2018-01-06)
[    0.000000] Boot CPU: AArch64 Processor [410fd034]
[    0.000000] Machine model: linux,dummy-virt
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: UEFI not found.
[    0.000000] cma: Reserved 64 MiB at 0x000000007c000000
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x7bfe8d80-0x7bfea87f]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000040000000-0x000000007fffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x000000007fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x000000007fffffff]
[    0.000000] On node 0 totalpages: 262144
[    0.000000]   DMA zone: 4096 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 262144 pages, LIFO batch:31
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv0.2 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: Trusted OS migration not required
[    0.000000] random: fast init done
[    0.000000] percpu: Embedded 22 pages/cpu @ffff80003bfbf000 s51608 r8192 d30312 u90112
[    0.000000] pcpu-alloc: s51608 r8192 d30312 u90112 alloc=22*4096
[    0.000000] pcpu-alloc: [0] 0
[    0.000000] Detected VIPT I-cache on CPU0
[    0.000000] CPU features: enabling workaround for ARM erratum 845719
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 258048
[    0.000000] Policy zone: DMA
[    0.000000] Kernel command line: rw root=/dev/vda2 console=ttyAMA0 loglevel=8 rootwait fsck.repair=yes memtest=1
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] Memory: 929640K/1048576K available (8252K kernel code, 1448K rwdata, 2692K rodata, 4480K init, 601K bss, 53400K reserved, 65536K cma-reserved)
[    0.000000] Virtual kernel memory layout:
[    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000   (   128 MB)
[    0.000000]     vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000   (129022 GB)
[    0.000000]       .text : 0xffff000008080000 - 0xffff000008890000   (  8256 KB)
[    0.000000]     .rodata : 0xffff000008890000 - 0xffff000008b40000   (  2752 KB)
[    0.000000]       .init : 0xffff000008b40000 - 0xffff000008fa0000   (  4480 KB)
[    0.000000]       .data : 0xffff000008fa0000 - 0xffff00000910a200   (  1449 KB)
[    0.000000]        .bss : 0xffff00000910a200 - 0xffff0000091a0910   (   602 KB)
[    0.000000]     fixed   : 0xffff7dfffe7fd000 - 0xffff7dfffec00000   (  4108 KB)
[    0.000000]     PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000   (    16 MB)
[    0.000000]     vmemmap : 0xffff7e0000000000 - 0xffff800000000000   (  2048 GB maximum)
[    0.000000]               0xffff7e0000000000 - 0xffff7e0001000000   (    16 MB actual)
[    0.000000]     memory  : 0xffff800000000000 - 0xffff800040000000   (  1024 MB)
[    0.000000] ftrace: allocating 30760 entries in 121 pages
[    0.000000] Hierarchical RCU implementation.
[    0.000000]  RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=1.
[    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] GICv2m: range[mem 0x08020000-0x08020fff], SPI[80:143]
[    0.000000] arch_timer: cp15 timer(s) running at 62.50MHz (virt).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x1cd42e208c, max_idle_ns: 881590405314 ns
[    0.000302] sched_clock: 56 bits at 62MHz, resolution 16ns, wraps every 4398046511096ns
[    0.009914] Console: colour dummy device 80x25
[    0.012641] Calibrating delay loop (skipped), value calculated using timer frequency.. 125.00 BogoMIPS (lpj=250000)
[    0.012909] pid_max: default: 32768 minimum: 301
[    0.020385] Security Framework initialized
[    0.020642] Yama: disabled by default; enable with sysctl kernel.yama.*
[    0.024548] AppArmor: AppArmor initialized
[    0.027902] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[    0.029240] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.029646] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
[    0.029768] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes)
[    0.082427] ASID allocator initialised with 65536 entries
[    0.084241] Hierarchical SRCU implementation.
[    0.096097] EFI services will not be available.
[    0.100621] smp: Bringing up secondary CPUs ...
[    0.100772] smp: Brought up 1 node, 1 CPU
[    0.100834] SMP: Total of 1 processors activated.
[    0.101024] CPU features: detected feature: 32-bit EL0 Support
[    0.102185] CPU: All CPU(s) started at EL1
[    0.103239] alternatives: patching kernel code
[    0.125618] devtmpfs: initialized
[    0.143398] Registered cp15_barrier emulation handler
[    0.143556] Registered setend emulation handler
[    0.145426] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.145708] futex hash table entries: 256 (order: 3, 32768 bytes)
[    0.156510] pinctrl core: initialized pinctrl subsystem
[    0.172962] DMI not present or invalid.
[    0.200830] NET: Registered protocol family 16
[    0.211350] cpuidle: using governor ladder
[    0.211624] cpuidle: using governor menu
[    0.213142] vdso: 2 pages (1 code @ ffff000008896000, 1 data @ ffff000008fa5000)
[    0.213525] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.248506] DMA: preallocated 256 KiB pool for atomic allocations
[    0.251039] Serial: AMBA PL011 UART driver
[    0.301138] 9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 39, base_baud = 0) is a PL011 rev1
[    0.325616] console [ttyAMA0] enabled
[    0.395569] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.429238] ACPI: Interpreter disabled.
[    0.435937] vgaarb: loaded
[    0.440160] EDAC MC: Ver: 3.0.0
[    0.443517] dmi: Firmware registration failed.
[    0.464842] clocksource: Switched to clocksource arch_sys_counter
[    0.772584] VFS: Disk quotas dquot_6.6.0
[    0.773664] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.784697] AppArmor: AppArmor Filesystem Enabled
[    0.786345] pnp: PnP ACPI: disabled
[    0.839478] NET: Registered protocol family 2
[    0.850840] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[    0.852147] TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
[    0.853176] TCP: Hash tables configured (established 8192 bind 8192)
[    0.856111] UDP hash table entries: 512 (order: 2, 16384 bytes)
[    0.857244] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
[    0.861344] NET: Registered protocol family 1
[    0.862523] PCI: CLS 0 bytes, default 128
[    0.871898] Unpacking initramfs...
[    4.090493] Freeing initrd memory: 18136K
[    4.096343] hw perfevents: enabled with armv8_pmuv3 PMU driver, 1 counters available
[    4.098513] kvm [1]: HYP mode not available
[    4.125313] audit: initializing netlink subsys (disabled)
[    4.135412] audit: type=2000 audit(3.792:1): state=initialized audit_enabled=0 res=1
[    4.138105] workingset: timestamp_bits=44 max_order=18 bucket_order=0
[    4.139633] zbud: loaded
[    6.425287] Key type asymmetric registered
[    6.426188] Asymmetric key parser 'x509' registered
[    6.427253] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 246)
[    6.429208] io scheduler noop registered
[    6.429674] io scheduler deadline registered
[    6.431361] io scheduler cfq registered (default)
[    6.431824] io scheduler mq-deadline registered
[    6.456480] pl061_gpio 9030000.pl061: PL061 GPIO chip @0x0000000009030000 registered
[    6.461967] OF: PCI: host bridge /pcie@10000000 ranges:
[    6.463397] OF: PCI:    IO 0x3eff0000..0x3effffff -> 0x00000000
[    6.464381] OF: PCI:   MEM 0x10000000..0x3efeffff -> 0x10000000
[    6.465118] OF: PCI:   MEM 0x8000000000..0xffffffffff -> 0x8000000000
[    6.466791] pci-host-generic 4010000000.pcie: ECAM at [mem 0x4010000000-0x401fffffff] for [bus 00-ff]
[    6.469428] pci-host-generic 4010000000.pcie: PCI host bridge to bus 0000:00
[    6.470670] pci_bus 0000:00: root bus resource [bus 00-ff]
[    6.471271] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[    6.471688] pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff]
[    6.472246] pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff]
[    6.474801] pci 0000:00:00.0: [1b36:0008] type 00 class 0x060000
[    6.531111] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    6.540327] Serial: AMBA driver
[    6.542637] msm_serial: driver initialized
[    6.546895] cacheinfo: Unable to detect cache hierarchy for CPU 0
[    6.552825] mousedev: PS/2 mouse device common for all mice
[    6.561496] rtc-pl031 9010000.pl031: rtc core: registered pl031 as rtc0
[    6.570512] ledtrig-cpu: registered to indicate activity on CPUs
[    6.571369] dmi-sysfs: dmi entry is absent.
[    6.575823] NET: Registered protocol family 10
[    6.589965] Segment Routing with IPv6
[    6.591302] mip6: Mobile IPv6
[    6.591770] NET: Registered protocol family 17
[    6.592429] mpls_gso: MPLS GSO support
[    6.597902] registered taskstats version 1
[    6.599788] zswap: loaded using pool lzo/zbud
[    6.604931] AppArmor: AppArmor sha1 policy hashing enabled
[    6.606231] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
[    6.615443] rtc-pl031 9010000.pl031: setting system clock to 2019-11-03 19:56:46 UTC (1572811006)
[    6.629055] uart-pl011 9000000.pl011: no DMA platform data
[    7.319338] Freeing unused kernel memory: 4480K
Loading, please wait...
starting version 236
[   13.581485]  vda: vda1 vda2
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Warning: fsck not present, so skipping root file system
[   15.018740] EXT4-fs (vda2): mounted filesystem with ordered data mode. Opts: (null)
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[   16.889484] ip_tables: (C) 2000-2006 Netfilter Core Team
[   17.059641] systemd[1]: systemd 236 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN default-hierarchy=hybrid)
[   17.065245] systemd[1]: Detected virtualization qemu.
[   17.066555] systemd[1]: Detected architecture arm64.

Welcome to Debian GNU/Linux buster/sid!

[   17.109915] systemd[1]: Set hostname to <rpi3>.
[   20.124641] systemd[1]: File /lib/systemd/system/systemd-journald.service:35 configures an IP firewall (IPAddressDeny=any), but the local system does not support BPF/cgroup based firewalling.
[   20.131610] systemd[1]: Proceeding WITHOUT firewalling in effect! (This warning is only shown for the first loaded unit using IP firewalling.)
[   20.811853] systemd[1]: Created slice System Slice.
[  OK  ] Created slice System Slice.
[   20.824769] systemd[1]: Created slice system-serial\x2dgetty.slice.
[  OK  ] Created slice system-serial\x2dgetty.slice.
[   20.828637] systemd[1]: Reached target Swap.
[  OK  ] Reached target Swap.
[   20.836453] systemd[1]: Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Listening on /dev/initctl Compatibility Named Pipe.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Set up automount Arbitrary Executab…rmats File System Automount Point.
[  OK  ] Listening on Journal Audit Socket.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
         Mounting Huge Pages File System...
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Reached target Slices.
[  OK  ] Listening on fsck to fsckd communication Socket.
[  OK  ] Listening on udev Control Socket.
[  OK  ] Listening on Syslog Socket.
         Mounting Kernel Debug File System...
[  OK  ] Reached target Remote File Systems.
[  OK  ] Listening on udev Kernel Socket.
[  OK  ] Listening on Journal Socket.
         Starting resize root file system...
         Starting udev Coldplug all Devices...
         Starting Journal Service...
[  OK  ] Created slice system-getty.slice.
         Mounting POSIX Message Queue File System...
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Reached target Paths.
         Starting Load Kernel Modules...
         Starting Create list of required st…ce nodes for the current kernel...
[   22.260366] systemd[1]: Mounted Kernel Debug File System.
[  OK  ] Mounted Kernel Debug File System.
[   22.291369] systemd[1]: Mounted Huge Pages File System.
[  OK  ] Mounted Huge Pages File System.
[   22.545402] systemd[1]: Mounted POSIX Message Queue File System.
[  OK  ] Mounted POSIX Message Queue File System.
[   22.864128] systemd[1]: Started Create list of required static device nodes for the current kernel.
[  OK  ] Started Create list of required sta…vice nodes for the current kernel.
[   23.024289] systemd[1]: Starting Create Static Device Nodes in /dev...
         Starting Create Static Device Nodes in /dev...
[   23.096069] systemd[1]: Started Load Kernel Modules.
[  OK  ] Started Load Kernel Modules.
[   23.258059] systemd[1]: Starting Apply Kernel Variables...
         Starting Apply Kernel Variables...
[   24.225745] systemd[1]: Started Apply Kernel Variables.
[  OK  ] Started Apply Kernel Variables.
[   24.587147] systemd[1]: Started Create Static Device Nodes in /dev.
[  OK  ] Started Create Static Device Nodes in /dev.
[   24.670161] systemd[1]: Started Journal Service.
[  OK  ] Started Journal Service.
[FAILED] Failed to start resize root file system.
See 'systemctl status rpi3-resizerootfs.service' for details.
[DEPEND] Dependency failed for Remount Root and Kernel File Systems.
         Starting Load/Save Random Seed...
         Starting udev Kernel Device Manager...
         Starting Flush Journal to Persistent Storage...
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting AppArmor initialization...
[  OK  ] Started Load/Save Random Seed.
[   27.039113] systemd-journald[143]: Received request to flush runtime journal from PID 1
[  OK  ] Started Flush Journal to Persistent Storage.
         Starting Create Volatile Files and Directories...
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Started udev Coldplug all Devices.
[  OK  ] Started Create Volatile Files and Directories.
         Starting Update UTMP about System Boot/Shutdown...
         Starting Network Time Synchronization...
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Found device /dev/ttyAMA0.
[   33.993816] input: gpio-keys as /devices/platform/gpio-keys/input/input0
[  OK  ] Started AppArmor initialization.
         Starting Raise network interfaces...
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Stopped Network Time Synchronization.
         Starting Network Time Synchronization...
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Stopped Network Time Synchronization.
         Starting Network Time Synchronization...
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Stopped Network Time Synchronization.
         Starting Network Time Synchronization...
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Stopped Network Time Synchronization.
[   45.025148] ip6_tables: (C) 2000-2006 Netfilter Core Team
         Starting Network Time Synchronization...
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Stopped Network Time Synchronization.
         Starting Network Time Synchronization...
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Stopped Network Time Synchronization.
         Starting Network Time Synchronization...
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Stopped Network Time Synchronization.
[FAILED] Failed to start Network Time Synchronization.
See 'systemctl status systemd-timesyncd.service' for details.
[  OK  ] Reached target System Initialization.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
         Starting Login Service...
         Starting generate SSH host keys...
[  OK  ] Started D-Bus System Message Bus.
         Starting WPA supplicant...
         Starting System Logging Service...
[  OK  ] Started Regular background program processing daemon.
[  OK  ] Started irqbalance daemon.
[  OK  ] Reached target System Time Synchronized.
[  OK  ] Started Daily apt download activities.
[  OK  ] Started Daily apt upgrade and clean activities.
[  OK  ] Reached target Timers.
[  OK  ] Started Raise network interfaces.
[  OK  ] Started Login Service.
[  OK  ] Started WPA supplicant.
[  OK  ] Reached target Network.
         Starting Permit User Sessions...
         Starting OpenBSD Secure Shell server...
[  OK  ] Started System Logging Service.
[  OK  ] Started Permit User Sessions.
[  OK  ] Started Getty on tty1.
[  OK  ] Started Serial Getty on ttyAMA0.
[  OK  ] Reached target Login Prompts.
[  OK  ] Started OpenBSD Secure Shell server.

Debian GNU/Linux buster/sid rpi3 ttyAMA0

rpi3 login:

The default login is root/raspberry

Customize the virtual environment / abstract from the real (physical) hardware

Remove the unstable apt repository and keep only the buster repository enabled:

root@rpi3:/etc/apt# cat sources.list
deb http://deb.debian.org/debian buster main contrib non-free
# deb http://deb.debian.org/debian unstable main contrib non-free
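
A minimal sketch of making this change from the shell, assuming the image ships with an active "unstable" line in sources.list. The function name disable_unstable is hypothetical, and the buster line still has to be present or added separately.

```shell
#!/bin/sh
# Hypothetical helper: comment out every active "unstable" deb line (GNU sed).
# $1 = path to a sources.list-style file, e.g. /etc/apt/sources.list
disable_unstable() {
    sed -i '/^deb .*[[:space:]]unstable[[:space:]]/ s/^/# /' "$1"
}

# Example:
# disable_unstable /etc/apt/sources.list
```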

Prevent the initramfs from being updated automatically: set update_initramfs=no

root@rpi3:# cat /etc/initramfs-tools/update-initramfs.conf
#
# Configuration file for update-initramfs(8)
#

#
# update_initramfs [ yes | all | no ]
#
# Default is yes
# If set to all update-initramfs will update all initramfs
# If set to no disables any update to initramfs beside kernel upgrade

update_initramfs=no

#
# backup_initramfs [ yes | no ]
#
# Default is no
# If set to no leaves no .bak backup files.

backup_initramfs=no
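
The same setting can be applied without an editor. This is a sketch that assumes the file already contains an update_initramfs= line (as in the listing above); the function name set_update_initramfs_no is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: force update_initramfs=no in the given config file (GNU sed).
# $1 = path, e.g. /etc/initramfs-tools/update-initramfs.conf
set_update_initramfs_no() {
    sed -i 's/^update_initramfs=.*/update_initramfs=no/' "$1"
}

# Example:
# set_update_initramfs_no /etc/initramfs-tools/update-initramfs.conf
```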

Install Docker

apt-get update
apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg2 \
    software-properties-common

Add Docker's official GPG key

curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add -

Add the arm64 Docker repository

add-apt-repository \
   "deb [arch=arm64] https://download.docker.com/linux/debian \
   $(lsb_release -cs) \
   stable"

Run apt-get update again after adding the repository

apt-get update

Install Docker for the 64-bit ARM architecture:

apt-get install docker-ce docker-ce-cli containerd.io

After installation Docker will not start up on its own. Validate that Docker works without systemd by starting it directly with either of the following commands:

/usr/bin/dockerd -H unix://
or
dockerd --debug

If Docker started and you did not see any real error that needs fixing first, reboot the system. Docker will then start automatically, and you can run, for example, the docker ps command.

Once you have validated that Docker is running, you can package the image for the virtualization engine that will host your cluster. In my case I am running the mentioned .vmdk image under GNS3.

If you are familiar with the Pi virtualization guide from the previous article (http://modernhackers.com/build-a-virtual-raspberry-pi-cluster/), the setup here is similar: you need to set up the auto-start parameters of your Raspberry Pi 3 so that the qemu-system-aarch64 binary starts once GNS3 has loaded your image.

Additionally, each QEMU instance needs a unique MAC address. You can achieve this by adding the following to the start-up commands in your shell's .profile:

MACADDR="52:54:00:$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | sed 's/^\(..\)\(..\)\(..\).*$/\1:\2:\3/')";
sudo qemu-system-aarch64 \
  -kernel vmlinuz-4.14.0-3-arm64 \
  -initrd initrd.img-4.14.0-3-arm64 \
  -m 1024 -M virt \
  -cpu cortex-a53 \
  -append "rw root=/dev/vda2 console=ttyAMA0 loglevel=8 rootwait fsck.repair=yes memtest=1" \
  -drive file=image.qcow,format=qcow2,if=sd,id=hd-root \
  -device virtio-blk-device,drive=hd-root \
  -net nic,macaddr="$MACADDR" \
  -net tap,ifname=tap0 \
  -nographic \
  -no-reboot
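
As an alternative sketch of the MAC generation step, the three random octets can be read straight from /dev/urandom with od, and the result can be checked before it is handed to QEMU. The function names gen_mac and is_valid_mac are hypothetical; the 52:54:00 prefix is the one used above.

```shell
#!/bin/sh
# Hypothetical helper: print a MAC address with the 52:54:00 prefix and
# three random octets.
gen_mac() {
    # od prints three random bytes as two-digit hex values separated by
    # spaces; the unquoted substitution splits them into three arguments.
    printf '52:54:00:%s:%s:%s\n' $(od -An -N3 -tx1 /dev/urandom)
}

# Hypothetical helper: succeed only for six colon-separated hex octets.
is_valid_mac() {
    echo "$1" | grep -Eq '^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$'
}

MACADDR=$(gen_mac)
is_valid_mac "$MACADDR" && echo "Using MAC address $MACADDR"
```

Because the address is random on every start, two nodes could in principle collide; with only three random octets and a handful of nodes this is unlikely, but a fixed per-node address is an option if you need stable DHCP leases.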

Once you have uploaded your QEMU image to GNS3, you can scale your cluster.

On the first RPi node, which will later become the Swarm manager, first install a Docker visualizer tool:

docker run -it -d -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock layer2/visualizer-aarch64

After that you can access it from your host by opening http://rpi-ip:8080

Initialize the Docker Swarm cluster on your first (manager) node

root@rpi3:# docker swarm init --advertise-addr 192.168.2.130
Swarm initialized: current node (p23fe3twd75q4h9rmtob5svdv) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-69d0d6el2a1r4xdqx6glgjtkxjdzeazq05ymy0zbg8ocop6h5a-7z9llgasts61rcqfa5c5z5v8a 192.168.2.130:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

Add more virtual Raspberry Pi 3s to your GNS3 virtual lab

You need to issue the join command on every additional RPi so that it joins the swarm cluster you created first.

root@rpi3:~# docker swarm join --token SWMTKN-1-69d0d6el2a1r4xdqx6glgjtkxjdzeazq05ymy0zbg8ocop6h5a-7z9llgasts61rcqfa5c5z5v8a 192.168.2.130:2377
This node joined a swarm as a worker.
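
If you lose the join command printed by docker swarm init, the worker token can be read back on the manager with docker swarm join-token. The sketch below rebuilds the full join command from it; the function name worker_join_command is hypothetical, and port 2377 is the one shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper, to be run on the Swarm manager.
# $1 = manager address that the workers can reach, e.g. 192.168.2.130
worker_join_command() {
    # -q prints only the token instead of the full instructions.
    token=$(docker swarm join-token -q worker) || return 1
    echo "docker swarm join --token $token $1:2377"
}

# Example:
# worker_join_command 192.168.2.130
```

The printed line can then be pasted into each additional virtual RPi.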

Stress Test your Virtual Raspberry Docker Cluster with a hello world container

root@rpi3:~# docker service create --name test --replicas 20 busybox:latest sh -c "while true; do echo Hello; sleep 2; done"
s77aa6svtbg0ov5l1kv56iagu
overall progress: 19 out of 20 tasks
1/20: running   [==================================================>]
2/20: running   [==================================================>]
3/20: running   [==================================================>]
4/20: running   [==================================================>]
5/20: running   [==================================================>]
6/20: running   [==================================================>]
7/20: running   [==================================================>]
8/20: running   [==================================================>]
9/20: running   [==================================================>]
10/20:
11/20: running   [==================================================>]
12/20: starting  [============================================>      ]
13/20: running   [==================================================>]
14/20: running   [==================================================>]
15/20: running   [==================================================>]
16/20: running   [==================================================>]
17/20: running   [==================================================>]
18/20: running   [==================================================>]
19/20: running   [==================================================>]
20/20: running   [==================================================>]

This overstressed one node: it turned out that one node had a lower memory setting in GNS3, and its qemu-system-aarch64 process stopped.

But the requested service still scaled out: regardless of the dead node, its tasks were spread across the remaining 4 nodes.

overall progress: 20 out of 20 tasks
1/20: running   [==================================================>]
2/20: running   [==================================================>]
3/20: running   [==================================================>]
4/20: running   [==================================================>]
5/20: running   [==================================================>]
6/20: running   [==================================================>]
7/20: running   [==================================================>]
8/20: running   [==================================================>]
9/20: running   [==================================================>]
10/20: running   [==================================================>]
11/20: running   [==================================================>]
12/20: running   [==================================================>]
13/20: running   [==================================================>]
14/20: running   [==================================================>]
15/20: running   [==================================================>]
16/20: running   [==================================================>]
17/20: running   [==================================================>]
18/20: running   [==================================================>]
19/20: running   [==================================================>]
20/20: running   [==================================================>]
verify: Service converged
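
To see how the replicas ended up distributed, you can count running tasks per node on the manager. This is a sketch built on docker service ps with a format template; the function name tasks_per_node is hypothetical, and the service name test is the one created above.

```shell
#!/bin/sh
# Hypothetical helper, to be run on the Swarm manager.
# $1 = service name. Prints "<node> <running task count>" per line.
tasks_per_node() {
    docker service ps "$1" --filter desired-state=running --format '{{.Node}}' |
        awk '{ count[$1]++ } END { for (node in count) print node, count[node] }' |
        sort
}

# Example:
# tasks_per_node test
```

A node that is down should simply be missing from the list, and its share of the 20 replicas shows up on the other nodes.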

Let’s fix the broken node by increasing the memory of the x86 host node that runs the qemu-system-aarch64 emulator to 2G. The aarch64 guest directly consumes 1G of memory per node. The host originally had 1.5G, which is not enough; 2G is needed for RPi 3 emulation.

Start up the broken RPi node and rejoin the RPI3-3 node to the swarm cluster

The faulty node has joined back
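
One way to confirm from the manager that every node is back is to list the nodes whose status is not Ready. This is a sketch using docker node ls with a format template; the function name nodes_not_ready is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, to be run on the Swarm manager.
# Prints the hostname of every node whose status is not "Ready".
nodes_not_ready() {
    docker node ls --format '{{.Hostname}} {{.Status}}' |
        awk '$2 != "Ready" { print $1 }'
}

# Example: no output means all nodes report Ready.
# nodes_not_ready
```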

root@rpi3:~# docker service create --name test --replicas 20 busybox:latest sh -c "while true; do echo Hello; sleep 2; done"
qjpsar9q1s8r7te42zawc68dn
overall progress: 0 out of 20 tasks
1/20: accepted  [===========================>                       ]
2/20: starting  [============================================>      ]
3/20: preparing [=================================>                 ]
4/20: preparing [=================================>                 ]
5/20: accepted  [===========================>                       ]
6/20: assigned  [======================>                            ]
7/20: new       [=====>                                             ]
8/20: preparing [=================================>                 ]
9/20: preparing [=================================>                 ]
10/20: preparing [=================================>                 ]
11/20: ready     [======================================>            ]
12/20: preparing [=================================>                 ]
13/20: assigned  [======================>                            ]
14/20: assigned  [======================>                            ]
15/20: accepted  [===========================>                       ]
16/20: assigned  [======================>                            ]
17/20: preparing [=================================>                 ]
18/20: accepted  [===========================>                       ]
19/20: preparing [=================================>                 ]
20/20: assigned  [======================>                            ]
overall progress: 15 out of 20 tasks
1/20: running   [==================================================>]
2/20: running   [==================================================>]
3/20: running   [==================================================>]
4/20: running   [==================================================>]
5/20: running   [==================================================>]
6/20: running   [==================================================>]
7/20: running   [==================================================>]
8/20: running   [==================================================>]
9/20: starting  [============================================>      ]
10/20: starting  [============================================>      ]
11/20: running   [==================================================>]
12/20: running   [==================================================>]
13/20: running   [==================================================>]
14/20: running   [==================================================>]
15/20: running   [==================================================>]
16/20: starting  [============================================>      ]
17/20: running   [==================================================>]
18/20: starting  [============================================>      ]
19/20: starting  [============================================>      ]
20/20: running   [==================================================>]
