Materials for Virtualization on Linux -- a simple choice, isn't it?
Support for hardware virtualization:
egrep '^flags.*(vmx|svm)' /proc/cpuinfo
| vmx/svm | no vmx/svm |
USB | kvm | qemu+kqemu |
no USB | VirtualBox |
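The table above can be turned into a small helper. This is only a sketch: the function name `suggest_virt` and its arguments are made up for illustration, and it assumes that VirtualBox is the pick whenever USB is not needed, regardless of CPU flags.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): suggest an emulator
# according to the table above.
# usage: suggest_virt [cpuinfo-file] [need_usb: yes|no]
suggest_virt() {
    cpuinfo="${1:-/proc/cpuinfo}"
    need_usb="${2:-yes}"

    # same test as the egrep one-liner above
    if grep -Eq '^flags.*(vmx|svm)' "$cpuinfo"; then
        hw=yes
    else
        hw=no
    fi

    if [ "$need_usb" = no ]; then
        echo "VirtualBox"      # assumption: chosen whenever USB is not needed
    elif [ "$hw" = yes ]; then
        echo "kvm"
    else
        echo "qemu+kqemu"
    fi
}
```

Calling `suggest_virt` without arguments checks the running machine and assumes USB is needed.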
How much CPU do I use? :-)
dpavlin@brr:~$ cpufreq-info
cpufrequtils 004: cpufreq-info (C) Dominik Brodowski 2004-2006
Report errors and bugs to cpufreq@lists.linux.org.uk, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which need to switch frequency at the same time: 0
  hardware limits: 2.40 GHz - 3.20 GHz
  available frequency steps: 3.20 GHz, 2.80 GHz, 2.40 GHz
  available cpufreq governors: userspace, powersave, ondemand, conservative, performance
  current policy: frequency should be within 2.40 GHz and 3.20 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 2.40 GHz.
  cpufreq stats: 3.20 GHz:1.80%, 2.80 GHz:0.00%, 2.40 GHz:98.20% (17)
Have many disks. More disk spindles bring more than capacity alone! (Same as in databases.)
If you think that disk has constant transfer speed, ZCAV has interesting graphs
Slow laptop 2.5" 5400 rpm disk:
dpavlin@llin:~$ sudo hdparm -i /dev/sda
/dev/sda:
 Model=FUJITSU MHV2080BH , FwRev=00840028, SerialNo= NW05T6B29HM5
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=156301488
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
 Drive conforms to: unknown: ATA/ATAPI-3,4,5,6,7
 * signifies the current active mode
dpavlin@llin:~$ sudo hdparm -tT /dev/sda
/dev/sda:
 Timing cached reads: 1566 MB in 2.00 seconds = 782.85 MB/sec
 Timing buffered disk reads: 66 MB in 3.03 seconds = 21.79 MB/sec
Interesting numbers are BuffSize (the disk's on-board cache) and MaxMultSect, which we want to use as the read-ahead parameter:
hdparm -m 16 -a 16 /dev/sda
This will slightly decrease the speed of the linear buffered reads which hdparm measures, but we will pull from the disk only blocks which are already in its cache, improving random read/write performance.
To find optimal readahead for your drive using hdparm access pattern you can use hdparm-readahead.pl which will try different combinations for you.
Faster (!) external 3.5 USB disk (no hdparm -i on USB), but just because it's another disk not loaded by system.
dpavlin@llin:~$ sudo hdparm -tT /dev/sdb
/dev/sdb:
 Timing cached reads: 1508 MB in 2.00 seconds = 753.72 MB/sec
 Timing buffered disk reads: 56 MB in 3.03 seconds = 18.48 MB/sec
Home-made software md RAID 5 array from SATA drives:
Dobrica Pavlinusic posted a photo:
Final position in the case. Notice the empty space above the 4th disk: it was previously occupied by disks, which didn't get enough airflow because of that.
Note the nice use of perforated metal construction strips, which are usually used to hold fences. They have holes of just the right size for screws to go through and hold the disks nicely spaced (although a little more space would be ideal). The strips are soft enough to be bent at the corners to produce a nice, level gap between them and the case.
The blog post RAID5 for home describes the setup in some detail.
Drive info:
dpavlin@brr:~$ sudo hdparm -i /dev/sdd
/dev/sdd:
 Model=WDC WD5000AAKS-00YGA0 , FwRev=12.01C02, SerialNo= WD-WCAS80929678
 Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
 BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=?16?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=976773168
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7
 * signifies the current active mode
Speed of individual drives in array:
dpavlin@brr:~$ sudo hdparm -tT /dev/sda /dev/sdb /dev/sdd
/dev/sda:
 Timing cached reads: 1982 MB in 2.00 seconds = 991.18 MB/sec
 Timing buffered disk reads: 232 MB in 3.03 seconds = 76.67 MB/sec
/dev/sdb:
 Timing cached reads: 2010 MB in 2.00 seconds = 1004.95 MB/sec
 Timing buffered disk reads: 228 MB in 3.01 seconds = 75.85 MB/sec
/dev/sdd:
 Timing cached reads: 2006 MB in 2.00 seconds = 1003.01 MB/sec
 Timing buffered disk reads: 230 MB in 3.01 seconds = 76.47 MB/sec
How they are assembled into the /dev/md0 RAID 5 array:
dpavlin@brr:~$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[0] sda1[2] sdb1[1]
      976767872 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
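In the status field at the end of the md0 entry, every `U` is a working member and a `_` marks a missing one. A minimal sketch of a check that could be run from cron follows. The function name `md_degraded` is made up for illustration, and the file argument exists only so the check can be tried on a saved copy.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if any md array in the given
# mdstat file has a missing member, e.g. [U_U] instead of [UUU].
# usage: md_degraded [mdstat-file]
md_degraded() {
    grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}

# example use:
#   if md_degraded; then echo "RAID array is degraded!"; fi
```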
Speed of array:
dpavlin@brr:~$ sudo hdparm -tT /dev/md0
/dev/md0:
 Timing cached reads: 1986 MB in 2.00 seconds = 993.20 MB/sec
 Timing buffered disk reads: 434 MB in 3.01 seconds = 144.41 MB/sec
As expected RAID 5 speed is 75 + 75 + 0 (parity disk) ~ 144 MB/sec
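The same back-of-the-envelope estimate for other array sizes, as a sketch: one disk's worth of bandwidth is spent on parity, so streaming reads get roughly (n - 1) times the single-disk speed. The function name is made up, and real arrays will land somewhat below this upper bound, as the 144 MB/sec above does.

```shell
#!/bin/sh
# Hypothetical helper: rough upper bound for RAID 5 streaming reads.
# usage: raid5_read_estimate <number-of-disks> <MB/sec per disk, integer>
raid5_read_estimate() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_read_estimate 3 75    # the array above: prints 150
```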
Disks don't like it hot!
root@brr:~# hddtemp /dev/sda /dev/sdb /dev/sdd
/dev/sda: WDC WD5000AAKS-00YGA0: 33°C
/dev/sdb: WDC WD5000AAKS-00YGA0: 32°C
/dev/sdd: WDC WD5000AAKS-00YGA0: 32°C
In the output above, the middle disk is /dev/sda, so it's 1° hotter than the other two. I could mitigate this with an additional fan at the front of the case, but it's making enough noise already, so I'll leave it as is.
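To keep an eye on this over time, the hddtemp output can be filtered for disks above some limit. This is only a sketch: the function name `hot_disks` and the default limit of 45 are assumptions for illustration, not values recommended by the drive vendor.

```shell
#!/bin/sh
# Hypothetical filter: print device and temperature for every disk
# hotter than the limit (default 45, an arbitrary assumption).
# usage: hddtemp /dev/sda /dev/sdb /dev/sdd | hot_disks [limit]
hot_disks() {
    # fields are separated by ": "; the last one looks like "33°C",
    # and adding 0 makes awk use only its leading number
    awk -F': ' -v limit="${1:-45}" '$NF + 0 > limit { print $1, $NF }'
}
```

With the output shown above and a limit of 32, only /dev/sda would be listed.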
root@brr:~# smartctl --all /dev/sda | head -20
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Second Generation Serial ATA family
Device Model: WDC WD5000AAKS-00YGA0
Serial Number: WD-WCAS80815866
Firmware Version: 12.01C02
User Capacity: 500,107,862,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sat Oct 11 00:27:01 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Before you start to believe in SMART as the solution to all disk health problems, read Failure Trends in a Large Disk Drive Population.
See also the Bad block HOWTO for smartmontools if you ever get SMART errors and don't just want to throw out your disk.
Also interesting is Some RAID Issues
Read also Why RAID 5 stops working in 2009
http://kvm.qumranet.com/kvmwiki/FAQ
sudo apt-get install kvm
http://kvm.qumranet.com/kvmwiki/Migration
Usually, you will use NFS for this. Edit /etc/exports and add something like this (if your local network is 192.168.1.x):
/rest 192.168.1.0/255.255.255.0(rw)
Then start the NFS server:
dpavlin@llin:~$ sudo /etc/init.d/nfs-user-server start
Mount the shared storage and run the qemu instance which will receive the running machine:
dpavlin@squeak:~$ mkdir mnt/rest
dpavlin@squeak:~$ sudo mount 192.168.1.13:/rest mnt/rest/
dpavlin@squeak:~$ ls -al mnt/rest/iso/gparted-live-0.3.9-4.iso
-rw-r--r-- 1 dpavlin dpavlin 98347008 Oct 9 17:31 mnt/rest/iso/gparted-live-0.3.9-4.iso
dpavlin@squeak:~$ kvm -cdrom mnt/rest/iso/gparted-live-0.3.9-4.iso -incoming tcp://0:4444 -monitor stdio
dpavlin@llin:~$ kvm -m 128 -cdrom /rest/iso/gparted-live-0.3.9-4.iso -monitor stdio -no-kvm
QEMU 0.9.1 monitor - type 'help' for more information
(qemu) migrate tcp://192.168.1.30:4444
We use -no-kvm to disable kvm because our target machine doesn't have vmx|svm support!
sudo apt-get install qemu kqemu-source
kqemu module compilation on Debian:
sudo module-assistant a-i kqemu
Seems to be the best supported right now (package in Debian, optional drivers for Windows, starting unmodified VMware machines -- after you guess the right settings, that is!)
The OSE version (no USB!) comes with Debian; compile vboxdrv with:
root@llin:~# module-assistant a-i virtualbox-ose
VBoxManage createvm -name "VirtWorkshop" -register
VBoxManage modifyvm VirtWorkshop -memory 512Mb -acpi on -boot1 dvd -nic1 nat -dvd /rest/iso/ScummVM\ Launcher\ 2.iso
VBoxManage createvdi -filename hda-8Gb.vdi -size 8Mb -register
VBoxManage modifyvm VirtWorkshop -hda hda-8Gb.vdi
VBoxHeadless -startvm VirtWorkshop
Sic! This requires more than reading the -h output:
dpavlin@x200:/virtual/win$ vboxmanage internalcommands converttoraw hda-winxp.vdi hda.img
VirtualBox Command Line Management Interface Version 2.1.4_OSE
(C) 2005-2009 Sun Microsystems, Inc.
All rights reserved.
Converting image "hda-winxp.vdi" with size 8589934592 bytes (8192MB) to raw...
OpenVZ is nice name-space virtualization, creating chroot jails on steroids, similar in spirit to Solaris zones. It is ideal if you want to run a single kernel and allocate resources using bean counters as opposed to hard limits (20% of CPU as opposed to one core). Each slice is called a VE.
dpavlin@zut:~$ sudo hdparm -tT /dev/cciss/c1d0 /dev/sda
/dev/cciss/c1d0:
 Timing cached reads: 2184 MB in 2.00 seconds = 1092.39 MB/sec
 Timing buffered disk reads: 324 MB in 3.02 seconds = 107.40 MB/sec
/dev/sda:
 Timing cached reads: 2144 MB in 2.00 seconds = 1071.89 MB/sec
 Timing buffered disk reads: 136 MB in 3.02 seconds = 45.02 MB/sec
Insert joke about enterprise storage
We are using normal Linux LVM with a single logical volume for all VEs.
First, resize the logical volume (note that the right command is lvextend, not vgextend):
root@koha-hw:~# vgextend -L +80G /dev/vg/vz
vgextend: invalid option -- L
  Error during parsing of command line.
root@koha-hw:~# lvextend -L +80G /dev/vg/vz
  Extending logical volume vz to 100.00 GB
  Logical volume vz successfully resized
root@koha-hw:~# resize2fs /dev/vg/vz
resize2fs 1.40-WIP (14-Nov-2006)
Filesystem at /dev/vg/vz is mounted on /vz; on-line resizing required
old desc_blocks = 2, new_desc_blocks = 7
Performing an on-line resize of /dev/vg/vz to 26214400 (4k) blocks.
The filesystem on /dev/vg/vz is now 26214400 blocks long.
root@koha-hw:~# df -h /vz/
Filesystem         Size  Used Avail Use% Mounted on
/dev/mapper/vg-vz   99G   20G   79G  21% /vz
Then, take a look at how much space the VEs take:
root@koha-hw:~# vzlist -o veid,diskspace,diskspace.s,diskspace.h,diskinodes,diskinodes.s,diskspace.h
      VEID   DQBLOCKS DQBLOCKS.S DQBLOCKS.H   DQINODES DQINODES.S DQBLOCKS.H
    212052   11717220   15728640   20971520      61001     286527   20971520
    212226    6407804   10485760   12582912      69011     435472   12582912
Alternatively, you can also execute df inside the VEs:
root@koha-hw:~# vzlist -o veid -H | xargs -i sh -c "echo --{}-- ; vzctl exec {} df -h"
--212052--
Filesystem  Size  Used Avail Use% Mounted on
simfs        15G   12G  3.9G  75% /
tmpfs       2.0G     0  2.0G   0% /lib/init/rw
tmpfs       2.0G     0  2.0G   0% /dev/shm
--212226--
Filesystem  Size  Used Avail Use% Mounted on
simfs        10G  6.2G  3.9G  62% /
tmpfs       2.0G     0  2.0G   0% /lib/init/rw
tmpfs       2.0G     0  2.0G   0% /dev/shm
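The two views can be reconciled by converting the vzlist numbers to gigabytes. The sketch below assumes the diskspace columns are counted in 1 KB blocks, which matches the output above: the soft limit of 15728640 for VE 212052 corresponds to the 15G simfs size that df reports inside it. The function name `dq_to_gib` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical filter: convert "veid used soft hard" columns,
# assumed to be in 1 KB blocks, to GiB (1 GiB = 1048576 KB).
# usage: vzlist -H -o veid,diskspace,diskspace.s,diskspace.h | dq_to_gib
dq_to_gib() {
    awk '{ printf "%s used=%.2fG soft=%.2fG hard=%.2fG\n",
                  $1, $2 / 1048576, $3 / 1048576, $4 / 1048576 }'
}
```

df -h rounds up, which is why the 11.17G in use by VE 212052 shows up there as 12G.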
Next, we will set diskspace on both VEs to the new logical volume size (because we want them to share all available resources):
root@koha-hw:~# vzlist -o veid -H | xargs -i vzctl set {} --diskspace 100G:100G --save
Saved parameters for VE 212052
Saved parameters for VE 212226
These VEs are not in production, and one is a development version of the other. When we move to production, we will want to enforce a stricter limit on disk usage, to protect the production machine from running out of disk space in case the development one goes wild.
We usually want to run some operation on a bunch of VEs at once. This can be done using vzctl exec in one sweep like this:
vzlist -H -o veid | xargs -i vzctl exec {} 'apt-get update && apt-get -y upgrade' 2>&1 | tee ~/log
You can read more about groupby.pl and sum.pl on my blog.
# install dependencies which are not part of standard lenny (sorry!)
cpanp i IPC::System::Simple
dpavlin@mjesec:~$ vzps -E axv --no-headers \
 | groupby.pl 'sum:($7+$8+$9*1024),1,count:1' --join 'sudo vzlist -H -o veid,hostname' --on 2 \
 | sort -rn | align | sum.pl -h
webgui.rot13.org  23 1026M OOOOOOOOOOOO                         1026M
0                385  855M OOOOOOOOOO------------               1882M
saturn.ffzg.hr    32  544M OOOOOO-----------------------        2427M
eprints.ffzg.hr   18  351M OOOO-----------------------------    2778M
arh.rot13.org     20  224M OO---------------------------------- 3003M
root@mljac:~# ps ax | grep getty | cut -c-5 | xargs vzpid
Pid     VEID    Name
5668    0       getty
5670    0       getty
5672    0       getty
5673    0       getty
5674    0       getty
5675    0       getty
9503    207016  getty
9504    207013  getty
9505    207013  getty
9534    207016  getty
9535    207015  getty
9536    207013  getty
9537    207013  getty
9538    207015  getty
9539    207015  getty
9540    207015  getty
9541    207016  getty
9542    207015  getty
9543    207016  getty
9545    207013  getty
9546    207013  getty
9547    207015  getty
9548    207016  getty
For example, fuse
dpavlin@brr:/dev$ vzctl set 100 --devices c:10:229:rw --save
A suite of Perl scripts in the spirit of xen-tools, but for OpenVZ.
This step is optional. If you don't want to use Perl modules from packages provided by your distribution, skip this step, and the modules will be automatically installed in the next one.
sudo apt-get install libio-prompt-perl libregexp-common-perl libdata-dump-perl
sudo apt-get install host
svn co svn://svn.rot13.org/vz-tools/trunk vz-tools
cd vz-tools
perl Makefile.PL
make
Please note that there is no need to run make install.
Tools are runnable from the current directory. This will probably change in later versions.
This is a quick hands-on overview of commands to get you started.
All commands must be started with root privileges.
This will perform the following steps:
All commands will be echoed on screen, even passwords. However, if you want to learn steps in creating OpenVZ VE, this is very helpful.
To run an interactive session which asks questions, use:
./vz-create.pl
Another alternative is to just enter a hostname (defined in /etc/hosts, for example):
./vz-create.pl my-new-ve.example.com
or to specify an IP address:
./vz-create.pl 192.168.42.42
root@black:~/vz-tools# time ./vz-clone.pl create 1001
Clone VE 1001 -> 101001
found LV /dev/vg/vz for /vz
vzquota : (warning) Quota is running, so data reported from quota file may not reflect current values
quota for 1001 | 10485760 < 20971520 | usage: 7826792
using existing /dev/vg/vz-clone-101001
Mounting /dev/vg/vz-clone-101001 to /tmp/vz-clone-101001
rsync /vz/private/1001 -> /tmp/vz-clone-101001/private
101001 new IP number: 10.42.42.42
101001 new hostname: clone-42.example.com
Please review config file: /etc/vz/conf/101001.conf
Add NAT for new VE with: iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
Start clone of 1001 with: vzctl start 101001
real 1m57.347s
user 0m2.252s
sys 0m8.591s
This format is supported by other emulators, so it's the best choice.
dpavlin@llin:/rest/vmware/winxp$ vmware-vdiskmanager -r Windows\ XP\ Professional.vmdk -t 0 /mnt/usb/vmware/win-xp.vmdk
Using log file /tmp/vmware-dpavlin/vdiskmanager.log
Creating a monolithic growable disk '/mnt/usb/vmware/win-xp.vmdk'
Convert: 57% done.
Keep in mind that vmware-vdiskmanager doesn't have really helpful error messages:
dpavlin@tab:/mnt/brr/virtual/winxp$ vmware-vdiskmanager -r Windows\ XP\ Professional.vmdk -t 0 /virtual/win-xp.vmdk
Using log file /tmp/vmware-dpavlin/vdiskmanager.log
Creating a monolithic growable disk '/virtual/win-xp.vmdk'
Failed to convert disk: The destination file system does not support large files (13).
This is really permission denied (errno 13)!
dpavlin@llin:/mnt/usb/vmware$ qemu-img info win-xp.vmdk
(VMDK) image open: flags=0x2 filename=win-xp.vmdk
image: win-xp.vmdk
file format: vmdk
virtual size: 3.0G (3221225472 bytes)
disk size: 3.0G
There is a way to extend the image using only qemu-img, but that involves converting the image to raw and appending zeros at the end to produce a larger image. However, we will do that using VMware's vmware-vdiskmanager:
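For completeness, the qemu-img route could look roughly like the sketch below. The helper name `grow_raw` and the file names are made up for illustration. It uses GNU truncate with a relative size, so it can only grow the file, and the added tail reads back as zeros (as a sparse hole rather than written blocks).

```shell
#!/bin/sh
# Hypothetical helper: extend a raw disk image by the given amount.
# usage: grow_raw <raw-image> <extra-size, e.g. 3G>
grow_raw() {
    # the "+" prefix makes truncate grow the file by that much
    truncate -s "+$2" "$1"
}

# Sketch of the whole round trip (file names are examples only):
#   qemu-img convert -O raw win-xp.vmdk win-xp.raw
#   grow_raw win-xp.raw 3G
#   qemu-img convert -O vmdk win-xp.raw win-xp-6G.vmdk
```

As with vmware-vdiskmanager, the partition inside the image still has to be resized afterwards.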
dpavlin@llin:/mnt/usb/vmware$ vmware-vdiskmanager -x 6Gb win-xp.vmdk
Using log file /tmp/vmware-dpavlin/vdiskmanager.log
Grow: 100% done.
The old geometry C/H/S of the disk is: 6241/16/63
The new geometry C/H/S of the disk is: 12483/16/63
Disk expansion completed successfully.
WARNING: If the virtual disk is partitioned, you must use a third-party
utility in the virtual machine to expand the size of the
partitions. For more information, see:
http://www.vmware.com/support/kb/enduser/std_adp.php?p_faqid=1647
This will make the disk unbootable, so we will have to resize the partition. Download the GParted live CD and resize the partition using it...
kvm -m 512 -hda win-xp.vmdk -no-acpi -std-vga -cdrom /rest/iso/gparted-live-0.3.9-4.iso -boot d
dpavlin@llin:/mnt/usb/vmware$ qemu-img convert -O qcow win-xp.vmdk win-xp.qcow
(VMDK) image open: flags=0x2 filename=win-xp.vmdk
dpavlin@llin:/mnt/usb/vmware$ ls -al win-xp.*
-rw-r--r-- 1 dpavlin dpavlin 3190906880 Oct 9 17:41 win-xp.qcow
-rw------- 1 dpavlin dpavlin 3208577024 Oct 9 17:35 win-xp.vmdk
This is a domU:
root@vega:~# uname -a
Linux vega 2.6.18-6-xen-amd64 #1 SMP Mon Jun 16 23:42:47 UTC 2008 x86_64 GNU/Linux
root@vega:~# hdparm -tT /dev/hda1
/dev/hda1:
 Timing cached reads: 5488 MB in 2.00 seconds = 2750.74 MB/sec
 Timing buffered disk reads: 318 MB in 3.00 seconds = 105.98 MB/sec
Proxmox is a bare-metal Debian 64-bit installation supporting containers using OpenVZ and full system emulation using KVM.
OpenVZ isn't the only Linux container solution:
Remove them:
cd c:\windows\system32\drivers
del agp440.sys
del intelppm.dll
Startup script:
# 3M RFID 810
usbdev=0403:6001
sudo chown -R $USER /proc/bus/usb/*
kvm -m 512 -hda win-xp.vmdk -no-acpi -std-vga -monitor stdio -usb -usbdevice host:$usbdev
USB sniffing:
info usbhost
It will not boot past the "Loading Nexenta..." stage without the kvm module loaded.
# to install from iso image
kvm -m 512 -hda solaris.vmdk -cdrom ../iso/nexenta-core-platform_1.0.1-b85-test4_x86.iso -boot d -net nic,model=rtl8139 -net user
# run after installation
kvm -m 512 -hda solaris.vmdk -net nic,model=rtl8139 -net user
Is Linux going the wrong way with btrfs as the solution to all storage problems? Linux and object storage devices