iaas Documentation Release 0.1.0 NorCAMS




iaas Documentation Release 0.1.0 NorCAMS June 18, 2015

Contents

1 Getting started
2 Installation
    2.1 Hardware installation
    2.2 IP Addressing Plan
3 Design
    3.1 Development hardware requirements
4 Development
5 Howtos and guides
    5.1 Build docs locally using Sphinx
    5.2 Git in the real world
    5.3 Install KVM on CentOS 7 from minimal install
    5.4 Configure a Dell S55 FTOS switch from scratch
    5.5 Install cumulus linux on ONIE enabled Dell S4810
    5.6 Configure idrac-settings on Dell 13g servers with USB stick
    5.7 Using vncviewer to access the console
    5.8 How to bootstrap a Foreman-instance
6 About the project
    6.1 What is NorCAMS anyways?
    6.2 People
    6.3 Tracking the project
    6.4 Project plan and description
    6.5 Reports
    6.6 Meetings


This is our current documentation.


CHAPTER 1 Getting started


CHAPTER 2 Installation

Documentation describing installation of the IaaS platform

2.1 Hardware installation

This documents how to install the hardware for our development phase.

2.1.1 Placement of equipment in racks

We are installing the equipment in two separate racks to allow for easier future expansion.

Installation guide

The six controller and compute nodes are installed in rack 1, the five storage nodes in rack 2. Each rack has a pair of layer 3 capable switches at the top. There is only a single management switch, at the top of rack 1, but if we expand the setup we'll have one management switch in each rack.


Single rack installation (not used)

We also documented an optional single rack installation setup. In this setup the physical boxes outlined in red are connected to routers loc-leaf-03 and 04. We decided not to use this setup because all the sites have enough space for two racks.


2.1.2 Cabling

All of the servers will be connected with dual 10G fiber links, a single 1G ethernet cable for management (PXE) and a 1G ethernet cable for the baseboard management controller (BMC). The layer 3 switches will be cabled as simply as possible for the first minimal product installation. Later we'll increase the number of physical connections for testing more designs and solutions.

Connection sheet

A LibreOffice spreadsheet documents all the connections between the hosts and devices. It is also available in HTML format.

Conceptual overview

This illustrates the cabling concept with a minimal solution for switch connectivity.

This is the same concept as above, but with additional switch cabling to enable testing of different network design and redundancy scenarios.
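As a rough cross-check of the cabling described in 2.1.2, the per-server links can be multiplied out over the eleven servers listed in 2.1.1. This is only a sketch: the dictionary names are illustrative, and it counts server-facing cables only, not the switch-to-switch links.

```python
# Server counts from section 2.1.1: six controller/compute nodes, five storage nodes.
SERVERS = {"controller_and_compute": 6, "storage": 5}

# Per-server cabling from section 2.1.2.
LINKS_PER_SERVER = {"10G fiber": 2, "1G management (PXE)": 1, "1G BMC": 1}


def cabling_totals(servers, links_per_server):
    """Return the total number of server-facing cables of each type."""
    server_count = sum(servers.values())
    return {kind: count * server_count for kind, count in links_per_server.items()}


totals = cabling_totals(SERVERS, LINKS_PER_SERVER)
for kind, total in totals.items():
    print(f"{kind}: {total}")

# Both 1G links of every server need a copper port somewhere on the
# management side, so this is the number of 1G server ports to plan for.
one_gig_ports = totals["1G management (PXE)"] + totals["1G BMC"]
print("1G server ports:", one_gig_ports)
```

The result is a quick way to compare the plan against the connection sheet before ordering optics and patch cables.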

2.1.3 Host network logical concept

This shows an overview of the different logical networks planned.

Service endpoint traffic is statically routed from the location core to the IaaS transport network. The transport network itself is internally an OSPF routed fabric, designed first as a fully-connected mesh of routers. Later, if we expand, a spine layer of routers will be added.

Management functions are provided by two separate management networks implemented as VLANs. The VLANs are tagged through the management switch into the core network, with their IP gateways terminated on the core or directly on a jumphost.

2.1.4 Manuals and documentation

Links to external hardware documentation

* Dell Force10 S55
* Dell Force10 S4810
* Dell PowerEdge R630
* Dell PowerEdge R730xd
* Dell RACADM
* idrac8 RACADM Command Line Interface
* idrac8 RACADM Command Line Interface Reference Guide
* Dell idrac Direct
* Methods to Save and Restore PowerEdge Server Configuration Settings and Firmware Images
* Importing Server Configuration Profile From USB Device

* Creating and Managing Server Configuration Profiles

2.2 IP Addressing Plan

We have 3 different subnets:

    x.x.x.x/27 oob
    x.x.x.x/27 mgmt
    x.x.x.x/24 cloud-public-addresses

2.2.1 mgmt ip allocation

    x.x.x.1 gw
    x.x.x.2 gw
    x.x.x.3 gw
    x.x.x.4 login-1
    x.x.x.5 foreman-1
    x.x.x.6 empty
    x.x.x.7 leaf-1
    x.x.x.8 leaf-2
    x.x.x.9 leaf-3
    x.x.x.10 leaf-4
    x.x.x.11 controller-1
    x.x.x.12 controller-2
    x.x.x.13 controller-3
    x.x.x.14 compute-1
    x.x.x.15 compute-2
    x.x.x.16 compute-3
    x.x.x.17 osd-1
    x.x.x.18 osd-2
    x.x.x.19 osd-3
    x.x.x.20 osd-4
    x.x.x.21 osd-5

2.2.2 cloud address ip allocation

    ### x.x.x.x/24 reserved for uh-sky
    # x.x.x.0/29 reserved nett-loopback
    x.x.x.0/32 - free
    172.16.0.1/32 fd00:0::1/128 leaf1
    172.16.0.2/32 fd00:0::2/128 leaf2
    x.x.x.3/32 leaf3
    x.x.x.4/32 leaf4
    x.x.x.5/32 - free
    x.x.x.6/32 - free
    x.x.x.7/32 - free
    # x.x.x.8/29 - free
    # x.x.x.16/28 - free
    # x.x.x.32/27 reserved nett-p2p
    172.16.1.0/24 fd00:1::0/64 leaf1 - leaf2
    x.x.x.36/30 leaf3 - leaf4
    x.x.x.40/30 leaf2 - leaf3
    x.x.x.44/30 - free

    x.x.x.48/30 - free
    x.x.x.52/30 - free
    x.x.x.56/30 - free
    x.x.x.60/30 - free
    # x.x.x.64/26 - free
    # x.x.x.128/25 reserved host-nett
    172.16.100.0/24 fd00:100::0/64 host-nett for all physical nodes
    172.16.100.1 leaf1
    172.16.100.2 leaf2
    172.16.100.3 leaf3
    172.16.100.4 leaf4
    172.16.100.5 controller1
    172.16.100.6 controller2
    172.16.100.7 controller3
    172.16.100.8 compute1
    172.16.100.9 compute2
    172.16.100.10 compute3
    172.16.100.11 storage1
    172.16.100.12 storage2
    172.16.100.13 storage3
    172.16.100.14 storage4
    172.16.100.15 storage5
    # below is historic
    x.x.x.128/29 controller 1
    x.x.x.136/29 controller 2
    x.x.x.144/29 controller 3
    x.x.x.152/29 compute 1
    x.x.x.160/29 compute 2
    x.x.x.168/29 compute 3
    x.x.x.176/29 storage 1
    x.x.x.184/29 storage 2
    x.x.x.192/29 storage 3
    x.x.x.200/29 storage 4
    x.x.x.208/29 storage 5
    x.x.x.216/29 - free
    x.x.x.224/29 - free
    x.x.x.232/29 - free
    x.x.x.240/29 - free
    x.x.x.248/29 - free

All boxes, including network equipment, have a mgmt interface and an oob interface on two separate networks in addition to the cloud public network.
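The way the cloud /24 in 2.2.2 is carved into blocks can be checked mechanically. The sketch below is not part of the project tooling; it assumes 192.0.2.0/24 (a range reserved for documentation) as a stand-in for the anonymised x.x.x.x/24, and the names `CLOUD`, `PLAN` and `build_blocks` are made up for illustration. It verifies that the blocks are aligned, do not overlap and fill the /24, and then lists the /30 point-to-point links and the historic /29 per-node blocks.

```python
import ipaddress

# Assumption: 192.0.2.0/24 (reserved for documentation) stands in for the
# anonymised x.x.x.x/24 cloud range.
CLOUD = ipaddress.ip_network("192.0.2.0/24")

# (label, offset of the first address in the /24, prefix length), as in 2.2.2.
PLAN = [
    ("nett-loopback", 0, 29),
    ("free", 8, 29),
    ("free", 16, 28),
    ("nett-p2p", 32, 27),
    ("free", 64, 26),
    ("host-nett", 128, 25),
]


def build_blocks(parent, plan):
    """Turn (label, offset, prefix) rows into networks inside parent."""
    blocks = []
    for label, offset, prefix in plan:
        first = parent.network_address + offset
        # Strict parsing raises ValueError if the block is not aligned.
        net = ipaddress.ip_network(f"{first}/{prefix}")
        if not net.subnet_of(parent):
            raise ValueError(f"{net} is outside {parent}")
        blocks.append((label, net))
    return blocks


blocks = build_blocks(CLOUD, PLAN)
nets = [net for _, net in blocks]

# The blocks must not overlap and together must fill the /24 exactly.
overlaps = [(a, b) for i, a in enumerate(nets) for b in nets[i + 1:] if a.overlaps(b)]
total = sum(net.num_addresses for net in nets)

# Point-to-point links are /30s carved out of the nett-p2p block.
p2p = next(net for label, net in blocks if label == "nett-p2p")
p2p_links = list(p2p.subnets(new_prefix=30))

# The historic layout gave each node a /29 out of the host-nett block.
host = next(net for label, net in blocks if label == "host-nett")
host_blocks = list(host.subnets(new_prefix=29))

print("overlaps:", overlaps, "addresses covered:", total)
print("p2p links:", [str(n) for n in p2p_links])
print("per-node /29 blocks:", len(host_blocks))
```

With the real prefix substituted for the stand-in, the same check can be rerun whenever a free block is allocated.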

CHAPTER 3 Design

High-level documents describing the IaaS platform design

3.1 Development hardware requirements

A key point is that each location is built from the same hardware specification. This is done to simplify the setup and to limit the influence of external variables as much as possible while building the base platform. The spec represents a minimal baseline for one site/location.

3.1.1 Networking

* 4x Layer 3 routers/switches
    * Connected as routed leaf-spine fabric (OSPF)
    * At least 48 ports 10 Gb SFP+ / 4 ports 40 Gb QSFP
    * Support for ONIE/OCP preferred
* 1x L2 management switch
    * 48 ports 1GbE, VLAN capable
    * Remote management possible
* Cabling and optics
    * 48x 10GBase-SR SFP+ transceivers
    * 8x 40GBase-SR4 QSFP+ transceivers
    * 4x QSFP+ to QSFP+, 40GbE passive copper direct attach cable, 0.5 meter
    * 4x 3 or 5 meter QSFP+ to QSFP+ OM3 MTP fiber cable

3.1.2 Servers

* 3x management nodes
    * 1U 1x12 core with 128 GB RAM
    * 2x SFP+ 10 Gb and 2x 1GbE

    * 2x SSD drives RAID1
    * Room for more disks
    * Redundant PSUs
* 3x compute nodes
    * 1U 2x12 core with 512 GB RAM
    * 2x SFP+ 10 Gb and 2x 1GbE
    * 2x SSD drives RAID1
    * Room for more disks
    * Redundant PSUs
* 5x storage nodes
    * 2U 1x12 core with 128 GB RAM
    * 2x SFP+ 10 Gb and 2x 1GbE
    * 8x 3.5" 2 TB SATA drives
    * 4x 120 GB SSD drives
    * No RAID, only JBOD
    * Room for more disks (12x 3.5"?)
    * Redundant PSUs

Comments

* Management and compute nodes could very well be the same chassis with different specs. Possibly even higher density, like half width, would be considered, but not blade chassis (it would mean non-standard cabling/connectivity).
* The key attribute for SSD drives is sequential write performance. SSDs might be PCIe connected.
* 2 TB disks are chosen for the storage nodes to speed up recovery times with Ceph.
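To get a feel for what this baseline adds up to per site, the spec can be multiplied out. The node and disk counts below come from 3.1.2; the replica count of 3 is an assumption made only for this illustration (the spec does not state how Ceph pools will be configured), so the usable figure is indicative at best.

```python
# Counts taken from the server spec in 3.1.2.
STORAGE_NODES = 5
HDD_PER_NODE = 8
HDD_TB = 2
SSD_PER_NODE = 4
SSD_GB = 120

COMPUTE_NODES = 3
SOCKETS_PER_COMPUTE = 2
CORES_PER_SOCKET = 12
RAM_GB_PER_COMPUTE = 512

# Assumption, not from the spec: data is stored as 3 replicas.
REPLICAS = 3

raw_hdd_tb = STORAGE_NODES * HDD_PER_NODE * HDD_TB
usable_hdd_tb = raw_hdd_tb / REPLICAS
ssd_total_gb = STORAGE_NODES * SSD_PER_NODE * SSD_GB

compute_cores = COMPUTE_NODES * SOCKETS_PER_COMPUTE * CORES_PER_SOCKET
compute_ram_gb = COMPUTE_NODES * RAM_GB_PER_COMPUTE

print(f"raw HDD capacity: {raw_hdd_tb} TB")
print(f"usable with {REPLICAS} replicas: {usable_hdd_tb:.1f} TB")
print(f"SSD capacity: {ssd_total_gb} GB")
print(f"compute: {compute_cores} cores, {compute_ram_gb} GB RAM")
```

Changing `REPLICAS` or the disk size shows how sensitive usable capacity is to those two choices.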

CHAPTER 4 Development


CHAPTER 5 Howtos and guides

This is a collection of howtos and documentation bits with relevance to the project.

5.1 Build docs locally using Sphinx

This describes how to build the documentation from norcams/iaas locally.

5.1.1 RHEL, CentOS, Fedora

You'll need the python-virtualenvwrapper package from EPEL:

    sudo yum -y install python-virtualenvwrapper
    # Restart shell
    exit
    # Make a virtual Python environment
    # This env is placed in .virtualenvs in $HOME
    mkvirtualenv docs
    # Activate the docs virtualenv
    workon docs
    # Install sphinx into it
    pip install sphinx sphinx_rtd_theme
    # Compile docs
    cd iaas/docs
    make html
    # Open in modern internet browser of choice
    xdg-open _build/html/index.html
    # Deactivate the virtualenv
    deactivate
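For machines without virtualenvwrapper, the same build can be sketched with the standard library `venv` module. This is an alternative to the steps above, not the project's documented procedure: the function names and the venv location are hypothetical, the `bin/python` path assumes a POSIX system, and `python -m sphinx -b html` is used in place of `make html`.

```python
import subprocess
import sys
from pathlib import Path


def docs_build_commands(repo_dir, venv_dir):
    """Return the commands that create a venv, install Sphinx and build HTML."""
    repo_dir, venv_dir = Path(repo_dir), Path(venv_dir)
    venv_python = venv_dir / "bin" / "python"  # POSIX layout assumed
    docs = repo_dir / "docs"
    return [
        [sys.executable, "-m", "venv", str(venv_dir)],
        [str(venv_python), "-m", "pip", "install", "sphinx", "sphinx_rtd_theme"],
        [str(venv_python), "-m", "sphinx", "-b", "html",
         str(docs), str(docs / "_build" / "html")],
    ]


def build_docs(repo_dir, venv_dir):
    """Run the build; stops at the first command that fails."""
    for command in docs_build_commands(repo_dir, venv_dir):
        subprocess.run(command, check=True)


# Only print the commands here; call build_docs() to actually run them.
commands = docs_build_commands("iaas", Path.home() / ".virtualenvs" / "docs")
for command in commands:
    print(" ".join(command))
```

Keeping command construction separate from execution makes it easy to review what will run before touching the local environment.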

5.2 Git in the real world

5.2.1 Fix and restore a messy branch

http://push.cwcon.org/learn/stay-updated#oops_i_was_messing_around_on_

5.3 Install KVM on CentOS 7 from minimal install

See http://mwiki.yyovkov.net/index.php/linux_kvm_on_centos_7

5.4 Configure a Dell S55 FTOS switch from scratch

This describes how to configure a Dell Force10 S55 switch from scratch as the management switch for our iaas.

5.4.1 Initial config

You will need a laptop with a serial console cable. Connect the cable to the RS232 port on the front of the switch. Open a console to ttyUSBx using screen, tmux, putty or other usable software. Then power on the switch.

After the switch has booted, enter the enable state:

    > enable

The switch will default to jumpstart mode, trying to get a config from a central repository. We disable it by typing:

    # reload-type normal

Now we need to provide an IP address, create a user with a password and set the enable password in order to provide ssh access:

    # configure
    (conf)# interface managementethernet 0/0
    (conf-if-ma-0/0)# ip address 10.0.0.2 /32
    (conf-if-ma-0/0)# no shutdown
    (conf-if-ma-0/0)# exit
    (conf)# management route 0.0.0.0 /0 10.0.0.1
    (conf)# username mylocaluser password 0 mysecretpassword
    (conf)# enable password 0 myverysecret
    (conf)# exit
    # write
    # copy running-config startup-config

Now you can ssh to the switch using your new user from a computer with access to the switch's management network.

5.4.2 Configure the switch itself

Let's configure the rest! We start by shutting down all ports:

    > enable
    # configure
    (conf)# interface range gigabitethernet 0/0-47
    (conf-if-range-gi-0/0-47)# switchport
    (conf-if-range-gi-0/0-47)# shutdown
    (conf-if-range-gi-0/0-47)# exit

If you want to use a port channel (with LACP) for a redundant uplink to the core you can create one. If you don't, omit all references to it later in the document:

    (conf)# interface port-channel 1
    (conf-if-po-1)# switchport
    (conf-if-po-1)# no shutdown
    (conf-if-po-1)# exit

Assign interfaces to the port channel group:

    (conf)# interface range gigabitethernet 0/42-43
    (conf-if-range-gi-0/42-43)# no switchport
    (conf-if-range-gi-0/42-43)# port-channel-protocol LACP
    (conf-if-range-gi-0/42-43)# port-channel 1 mode active
    (conf-if-range-gi-0/42-43)# no shutdown
    (conf-if-range-gi-0/42-43)# exit

Define in-band and out-of-band VLANs:

    (conf)# interface vlan 201
    (conf-if-vl-201)# description "iaas in-band mgmt"
    (conf-if-vl-201)# no ip address
    (conf-if-vl-201)# untagged GigabitEthernet 0/22-33,38-41
    (conf-if-vl-201)# tagged Port-channel 1
    (conf-if-vl-201)# exit
    (conf)# interface vlan 202
    (conf-if-vl-202)# description "iaas out-of-band mgmt"
    (conf-if-vl-202)# no ip address
    (conf-if-vl-202)# untagged GigabitEthernet 0/0-10
    (conf-if-vl-202)# tagged Port-channel 1
    (conf-if-vl-202)# exit
    (conf)# exit

Congratulations! Save the config and happy server provisioning:

    # write
    # copy running-config startup-config

5.5 Install cumulus linux on ONIE enabled Dell S4810

The project will be using Dell Force10 S4810 switches with the ONIE installer enabled by default instead of FTOS. This enables easy installation of Cumulus Linux on the switches.

5.5.1 Configure dhcpd and http server

You will need a running http server with a copy of the cumulus image:

    # ls /var/www/html
    CumulusLinux-2.5.0-powerpc.bin  onie-installer-powerpc

onie-installer-powerpc is a symlink to the bin file. ONIE uses the symlink to identify an image to download. Read about the order in which ONIE tries to download the install file here: http://opencomputeproject.github.io/onie/docs/user-guide/

Now, for the dhcp server to serve out an IP address and a URL for ONIE to download from, dhcp option 114 (URL) is used. This example utilizes ISC dhcpd:

    option default-url = "http://192.168.0.1/onie-installer-powerpc";

This option can be set per host, group or subnet, or system wide. Read more about different dhcp servers and other methods here: https://support.cumulusnetworks.com/hc/en-us/articles/203771426-using-onie-to-install-cumulus-linux

When you power up the switch, it will by default be a dhcp client and accept an offered IP address, after which you can ssh to the ONIE installer as user root without a password. However, if option 114 is specified, it will download the image, install it immediately and then reboot the switch. When the installation is complete, you can ssh to the switch using the default cumulus login.

5.6 Configure idrac-settings on Dell 13g servers with USB stick

With Dell PowerEdge 13g servers the idrac baseboard management controller can be configured automatically by reading settings from an xml file located on a USB stick. The USB port to be used is labelled with a wrench icon. By default, Dell PE 13g servers will auto apply config in this manner if the default username and/or password has not been changed, so typically new servers are prime targets.
5.6.1 Create USB stick and copy files to it

You will need a USB stick formatted with fat32 and a directory called:

    System_Configuration_XML

Two files are needed:

    config.xml
    control.xml

These xml files can be exported from an already configured server, or better still, git cloned from https://github.com/norcams/dell-idracdirect

5.6.2 Apply profile to server idrac

Provide power to the server, but do not insert the USB stick just yet. Power on the server, and wait for the POST process to finish. After POST has finished, insert the USB stick into the port with the wrench label on the front of the server. If the server has a display, it will first show importing, then applying. After roughly 10 seconds the server will reboot. You will notice this because all lights go out. Remove the USB stick and proceed to the next server.
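Before walking a stick around the server room it can be worth confirming that it has the layout described in 5.6.1. The helper below is a minimal sketch, not part of the norcams repositories: `missing_usb_files` is a made-up name, it only checks that the directory and the two files exist (not that their contents are valid), and the demo uses a temporary directory in place of a real mount point.

```python
import tempfile
from pathlib import Path

# Directory and file names as given in section 5.6.1.
REQUIRED_DIR = "System_Configuration_XML"
REQUIRED_FILES = ("config.xml", "control.xml")


def missing_usb_files(mount_point):
    """Return the relative paths that are missing below mount_point."""
    base = Path(mount_point) / REQUIRED_DIR
    return [
        f"{REQUIRED_DIR}/{name}"
        for name in REQUIRED_FILES
        if not (base / name).is_file()
    ]


# Demo: a temporary directory plays the role of the mounted USB stick.
with tempfile.TemporaryDirectory() as stick:
    before = missing_usb_files(stick)
    profile_dir = Path(stick) / REQUIRED_DIR
    profile_dir.mkdir()
    for name in REQUIRED_FILES:
        # Placeholder content only; real files come from an export or the git repo.
        (profile_dir / name).write_text("<placeholder/>\n")
    after = missing_usb_files(stick)

print("missing before copying:", before)
print("missing after copying:", after)
```

On a real stick, pass the mount point (for example wherever the desktop mounted it) and expect an empty list.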

5.7 Using vncviewer to access the console

We configure the BMC (baseboard management controller) on our servers to enable a VNC server feature. Accessing the console through VNC is easier and faster than using the Java-based console available through the BMC web interface.

On CentOS/RedHat/Fedora, install the needed VNC client packages:

    yum -y install tigervnc tigervnc-server-minimal
    vncpasswd
    # -> enter the idrac password and confirm
    vncviewer -passwd ~/.vnc/passwd 1.2.3.4:5901

The tigervnc-server-minimal package is needed in order to get the vncpasswd utility. This creates a passwd file that is used for providing a password when connecting to the VNC server.

The VNC server on the BMCs is listening on port 5901. Only a single connection is allowed by the server.

5.8 How to bootstrap a Foreman-instance

This document describes the procedure to initialize a new environment from a single login node. The systems to be used are all physically installed (including configuration of BIOS/iDrac) but otherwise untouched.

5.8.1 Prerequisites

* a functioning login node (with an up-to-date /opt/[himlar repo] hierarchy)
* the system is configured by Puppet
* no management nodes (controllers) are installed
* hieradata/<loc>/common.yaml is populated with relevant network data
* all commands run as the admin user (root) (log in using normal login procedure: iaas user from login node, then sudo)

5.8.2 Procedure

1. On the login node:

       /usr/local/sbin/bootstrap-<loc>-controller-01.sh

   Note: The error message "curl: (33) HTTP server doesn't seem to support byte ranges. Cannot resume." is harmless when the script has been run previously. In that case it just indicates that the files to be fetched are already in place.

2. Boot the relevant physical node, for instance by using the web GUI on the idrac or with this command on the login node:

       idracadm -r <idrac-ip for <loc>-controller-01 to be installed> -u gaussian -p <idrac-pw> serveraction powercycle

   Note: Make sure the system is configured to PXE boot on first attempt!

   Important: When the new controller is fully installed, the script started in 1) must be quit if the new system is set to primarily attempt PXE boot, otherwise it will enter an endless installation loop!

3. Log on to the freshly installed controller node

4. Run /opt/himlar/provision/puppetrun.sh

5. Punch a hole in the firewall for traffic to port 8000:

       iptables -I INPUT 1 -p tcp --dport 8000 -j ACCEPT

6. Run /usr/local/sbin/bootstrap-<loc>-foreman-01.sh

   (a) virsh list should now report the foreman instance as running
   (b) The install can be monitored with vncviewer <loc>-controller.01... (or your preferred vnc viewer application)
   (c) When the message "Guest installation complete... restarting guest." is written to the terminal from where the script was started, the system is installed and ready for use.
   (d) The new controller node can be logged on to from the login node: ssh iaas@<loc>-foreman-01...

7. When controller node installation is complete the firewall can be restored:

       iptables -D INPUT 1

8. Sync /opt/repo from the login node to the foreman node (NB: fix/repair ownership if necessary, it should be root:root)

9. Log on to the new foreman system from the login node and optionally check the install log: /root/install.post.log

10. Run HIMLAR_CERTNAME=<certname> /opt/himlar/provision/puppetrun.sh

    This command can be run several times.

11. Run /opt/himlar/provision/foreman-settings.sh

At this point there should be a working Foreman instance running which can be logged in to through the web GUI (http/https). This system is then running in a libvirt instance on the physical controller node.

5.8.3 Additional steps after Foreman installation

It is beneficial to get the controller node registered in Foreman and listed as a compute resource. This makes it possible to install other systems, like the OpenStack master node, and to connect the Foreman node itself to this libvirt resource.

1. On the controller node, run puppet apply --test a couple of times

2. In the Foreman GUI, sign relevant pending certificate requests, if any

3. On the Foreman node (cli), run /etc/puppet/node.rb push-facts (is this necessary?)

4. In the Foreman GUI, register a libvirt resource:

   (a) Infrastructure -> Compute resources
   (b) New compute resource
   (c) Fill in the fields:

           Name          whatever descriptive
           Provider      Libvirt
           URL           qemu+tcp://<loc>-controller-01.iaas.uio.no:16509/system
           Display type  VNC

   (d) Check the configured connection: Test connection
   (e) Submit

5. Select the new resource in the GUI and then the Virtual machines tab; the Foreman node should now be automatically registered here.
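The libvirt URL entered in step 4 follows a fixed pattern, so it can be generated per location rather than typed by hand. The sketch below only builds and parses the string shown in 5.8.3; the function name `libvirt_url` and the location code "osl" are hypothetical and used purely for illustration.

```python
from urllib.parse import urlsplit


def libvirt_url(location, domain="iaas.uio.no", port=16509):
    """Build the compute resource URL for a location's first controller."""
    return f"qemu+tcp://{location}-controller-01.{domain}:{port}/system"


# "osl" is a made-up location code standing in for <loc>.
url = libvirt_url("osl")
parts = urlsplit(url)

print(url)
print("scheme:", parts.scheme)
print("host:", parts.hostname)
print("port:", parts.port)
```

Splitting the URL again is a cheap way to catch a mistyped host or port before pasting the value into the Foreman form.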

CHAPTER 6 About the project

norcams/iaas is an open source effort focused on automating, documenting and delivering all parts of a complete, OpenStack-based production-quality infrastructure. This repository is our project handbook.

Infrastructure as code and automation first are the main technical driving forces, along with a general need for faster and more efficient delivery of standardized, self-provisioned services among IT departments in the Norwegian academic sector.

Development is funded by the participating entities, which contribute employees and knowledge to a nationally distributed team of engineers. Project goals are set, changed and validated within a formal project organization where management from all contributing entities is present. This project organization is named UH-sky and is coordinated by UNINETT, the Norwegian NREN organization.

6.1 What is NorCAMS anyways?

It is nothing more than a name label, really. Or a lot more, if you choose for it to be. Pretty confusing, right?

To provide some background: when the universities started to collaborate it became apparent that we needed a name of some sort to identify us and what we were trying to do. Key words were technology, collaboration and learning to continuously improve. In order to have something, the NorCAMS name was invented and presented at a meetup in Tromsø in early 2014. It is a play on words created from the words Norwegian (or it could be Nordic?) and CAMS.

CAMS is an acronym (we all love them, right?) coined in 2010 by Damon Edwards and John Willis at the first US based Devopsdays. It stands for Culture, Automation, Measurement and Sharing and has become a mantra for the devops community and concept.

NorCAMS is used as an identifier of the open source and collaboration aspect of the formal UH-sky IaaS project. It is useful in several ways, possibly mostly as a marker to show our ambition to be truly open. By not using a more official name we hope to not scare off anyone, thus maybe attracting contributors?

Some further references around CAMS and Devops for those interested:

* John Willis, July 16, 2010: What Devops Means to Me (explaining CAMS)
* Patrick Debois started Devopsdays in 2009 with Devopsdays Ghent
* James Turnbull, Feb 2010: What DevOps means to me...

6.2 People

6.3 Tracking the project

Contents

* Tracking the project
    * Chat room
    * Tasks and progress reporting
    * Core team weekly schedule
        * Daily status meeting
        * Weekly planning meeting
    * Project calendar
    * Social sharing platform

6.3.1 Chat room

All members of the project are expected to join and follow our chat room while working. The chat room is used for socializing, status updates, informal quick questions and coordinating various group efforts. Commit messages from the most important git repositories we use are announced in the chat room automatically.

To start using the chat room, connect to an IRC server on the Freenode network and join the #uh-sky room.

Remember, everything in the room is logged on the public internet at https://botbot.me/freenode/uh-sky/

6.3.2 Tasks and progress reporting

The project uses a Trello board for tasks and project planning. Core members are expected to add and manage cards directly on the board. Tasks described on the cards should not be too complicated to solve; ideally we want cards to flow through the board each day. If we do this correctly we get a low-cost, low-friction way of reporting progress and status.

Divide and conquer seems like a good idea to try for this. If a card stays in the same column for a day, divide it and try to get smaller parts of it to Done!

The Goals column is a bit special. This is where we put larger goals and milestones broken out from the project plan. Goals move directly to the Done column once they are reached.

The board is public and available at https://trello.com/b/m7td31zu/iaas

To be able to comment on a card you'll need a Trello account. Most of the team members use a Google account as their login identity.

6.3.3 Core team weekly schedule

This table shows which days the core team members are available. Jan Ivar, Tor and Erlend are working full time.

    Name        Monday  Tuesday  Wednesday  Thursday  Friday
    Erlend      1       1        1          1         1
    Hans-Henry  1       1        1          0         0
    Hege        0       0        1          1         0
    Jan Ivar    1       1        1          1         1
    Marte       0       0        0.5        1         1
    Mikael      0       1        1          1         0
    Tor         1       1        1          1         1

Daily status meeting

The core team has daily meetings at 09:30 every work day. These are short meetings meant to summarize what has been worked on since yesterday, what will be done today and what blocks progress, if anything. Each team member is expected to speak briefly about their own situation.

Daily meetings are held on Google Hangouts and published to the project calendar. They are also announced in the chat room a few minutes before they start.

Weekly planning meeting

The weekly planning meeting is where we discuss direction, milestones and general progress. This is the place for any larger topics or issues involving the full team. To schedule a topic for this meeting, project members make a card in Trello and label it as Discussion.

The weekly planning meeting is held on Google Hangouts and published to the project calendar.

6.3.4 Project calendar

Meetings and events are published to a public Google calendar. It is possible to read it as a webpage or subscribe to it in ical format.

Right now you'll need to use the webpage interface to find the Google Hangouts video links for each event. There is a plan to update the event description field in the ical data with the Hangout URL by using this Python code, but it has not been done yet.

6.3.5 Social sharing platform

We have been using a NorCAMS Google Plus community to share links to project related information for a while. Anyone with relevant content is free to use this as a channel.

We put up a web redirect to the community page to make it easier to find. It is at http://plus.norcams.org

6.4 Project plan and description

UH-sky IaaS platform development

* Project plan and description
  * Descriptive summary
    * Limitations
    * Prerequisites
  * Project goals and success criteria
    * 1. Develop, document and deliver a base IaaS platform
    * 2. Integration of authentication and authorization
    * 3. Further develop and verify services to cover traditional workloads
    * 4. Research and suggest a solution for PaaS
    * 5. Research and suggest possible SaaS services
    * 6. Research and specify a consumer-focused self-service portal
  * Project milestones and scheduling
  * Resources and budgeting
  * Project organization and management
    * Core development and engineering
    * Technical steering group
    * Top-level management and ownership
  * Risks
  * Appendix
    * 1. Support for the Microsoft Windows operating system
    * 2. Licensing of instances in the service
    * 3. Calculating needed capacity for development

6.4.1 Descriptive summary

This document describes what the IaaS project will develop and deliver. The project aims to position IaaS as a common building block and vessel for future IT infrastructure and services delivery in the academic sector.

The main project activity is developing, documenting and delivering an open source IaaS platform ready for production use by June 15th 2015.

Additional activities that expand and build on top of this platform are described. These activities will need to be researched, discussed and specified in greater detail before they can be put into action. The project plan sets the earliest startup time for these activities to February/March 2015.

The base IaaS platform will deliver these services:

* Compute
* Storage in 2 variants
  * Block storage, accessible as virtual disks for compute instances
  * Object storage, accessible over the network through an API

Limitations

The project will not deliver traditional backup. A common definition of backup states that backup data must be off-site and off-grid (e.g. tape).

A planned property of the storage system is the option to have an instance replicated to another location.

The additional activities described are dependent on the base IaaS platform.

Initial success criteria for the additional activities are described, but no cost estimates (resources, budget) are given as part of this project plan.

Prerequisites

To be able to deliver the platform as described, on time, the project must get access to the needed resources:

* At least 3 people must work full-time (100%) with the main project activity
* No roles less than 50%
* If split roles are used, alternating blocks of at least 3 days continuous work hours must be with the project

The project will need at least 6 months from the Locations complete milestone to delivery of the platform. This means that to deliver on time by 15th of June 2015, procurement of the needed hardware will need to be completed within 2014.

If hardware is delayed until 2015, the final delivery date will be delayed by the same amount of time, counting from August 15th 2015, as June and July are not counted due to vacations. For example, if Locations complete is reached in February 2015, final delivery will be 15th of October 2015.

6.4.2 Project goals and success criteria

The project will deliver a base IaaS platform to form a building block for future IT infrastructure delivery in the academic sector. The project has defined the following activities:

1. Develop, document and deliver a base IaaS platform
2. Integration of authentication and authorization
3. Further develop and verify services to cover traditional workloads
4. Research and suggest a solution for PaaS
5. Research and suggest possible SaaS services
6. Research and specify a consumer-focused self-service portal

Activities 1 and 2 were passed by the UH-sky steering group in June 2014.

To describe the activities, a format similar to user stories is used. The stories share a common set of definitions:

service
    The base IaaS platform, including all services layered below

user
    A person within the academic sector (with an identity record in FEIDE) given rights to administer instances and services on behalf of a tenant.

tenant
    An organization or unit within the Norwegian academic sector

administrator
    A person given responsibility and access to all the components of the service. This does not extend to access rights to the resources of a tenant.

small instance
    A compute instance defined as 1 vCPU, 4 GB RAM, 10 GB storage

large instance
    A compute instance defined as 4 vCPU, 16 GB RAM, 100 GB storage
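To make the instance definitions concrete, the following minimal Python sketch multiplies them by the capacity targets stated for activity 1 (~750 small instances or ~275 large instances, equally divided across three sites). The class and function names are illustrative assumptions and not part of the project plan; only the instance sizes, instance counts and number of sites come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSize:
    """An instance size as defined in the project plan."""
    name: str
    vcpus: int
    ram_gb: int
    storage_gb: int


# Sizes taken from the definitions in the plan.
SMALL = InstanceSize("small", vcpus=1, ram_gb=4, storage_gb=10)
LARGE = InstanceSize("large", vcpus=4, ram_gb=16, storage_gb=100)

# The plan asks for capacity equally divided across three sites.
SITES = 3


def aggregate(size, count):
    """Total vCPU, RAM and instance storage for `count` instances of `size`."""
    return {
        "vcpus": size.vcpus * count,
        "ram_gb": size.ram_gb * count,
        "storage_gb": size.storage_gb * count,
    }


def per_site(totals, sites=SITES):
    """Split aggregate totals equally across the sites."""
    return {key: value / sites for key, value in totals.items()}


if __name__ == "__main__":
    # Hypothetical usage: print totals for both capacity targets.
    for size, count in ((SMALL, 750), (LARGE, 275)):
        totals = aggregate(size, count)
        print(size.name, count, totals, per_site(totals))
```

Under these definitions, 750 small instances add up to 750 vCPUs and 3000 GB RAM, while 275 large instances add up to 1100 vCPUs and 4400 GB RAM, so the two targets are not identical in aggregate resources.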

1. Develop, document and deliver a base IaaS platform

This is the main project activity.

* The service must deliver capacity for ~750 small instances or ~275 large instances with a total of 100 TB accessible storage. This capacity should be equally divided across three geo-dispersed sites.
* The project must deliver a proof-of-concept PaaS solution able to offer three standardized development environments.
* The project must deliver proof-of-concept operation of at least one common service, in a SaaS-like model.
* The service must enable and document an expansion of the base platform to include (existing or new) HPC environments and workloads.
* The service must deliver data that can be used for billing tenants. The data delivered must be usable to identify users, organizations and organization units.
* A user must be able to start an instance immediately after first login. The instance must be available within 60 seconds.
* A user must be able to create, update and delete instances in the service from a graphical user interface in a browser, using an API or by using command line tools.
* A user must be able to select whether an instance should have a persistent boot volume or not.
* A user must be able to assign and use more storage as needed, within a quota. Billing of storage must be per usage, not per quota.
* A user should be able to place or move an instance geographically across the available locations. The choice should be possible to make according to the user's need for redundancy, resilience, geographical distance or other factors.
* A user should be able to choose that an instance is replicated to other locations automatically, thus potentially increasing protection against service outages.
* A user must be given the ability to monitor service performance and quality continuously.
* An administrator must use two-factor authentication for any access to the service for systems management and maintenance purposes.
* An administrator must be able to expand capacity, plan and execute infrastructure changes and fix errors in all parts of the service by using version-controlled code and automation. This key point should cover all operational tasks like discovery, deployment, maintenance, monitoring and troubleshooting.

2. Integration of authentication and authorization

* A user must be able to authenticate via FEIDE and be authorized as belonging to a tenant in the service
* Any FEIDE user passwords should NOT be stored in the service

Before the service can be used in a production scenario it is necessary to integrate central authentication and authorization. Users in the service must be identified as belonging to an organizational entity with correct billing information.

This activity must research and document a model and solution that shows how user and organization data from FEIDE (and other sources) can be integrated to cover the needs of the service. The model must be detailed enough to make it possible to estimate cost and resource constraints for the solution. Limitations in the chosen solution and model must be described.

Suggestions and cost estimates for more advanced id/authn/authz models, e.g. users and billing across organizational boundaries, must be discussed. An analysis and assessment of integration with the UNINETT project FEIDE Connect should be done as part of this.
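The billing-related requirements above (billing data must identify users, organizations and organization units, and storage must be billed per usage, not per quota) can be illustrated with a small Python sketch. This is only an assumption about what such a record could look like: the record layout, field names and sample identifiers are hypothetical, and the plan leaves the actual data model to be researched.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageUsage:
    """Hypothetical billing record; all field names are illustrative."""
    user: str          # identifies the user
    organization: str  # identifies the organization (tenant)
    org_unit: str      # identifies the organization unit
    quota_gb: int      # storage the user is allowed to assign
    used_gb: int       # storage the user has actually assigned


def billable_gb(record):
    """Storage is billed per usage, so the quota is deliberately ignored."""
    return record.used_gb


def usage_per_org_unit(records):
    """Sum billable storage per (organization, organization unit)."""
    totals = defaultdict(int)
    for record in records:
        totals[(record.organization, record.org_unit)] += billable_gb(record)
    return dict(totals)


# Made-up sample data, for illustration only.
SAMPLE = [
    StorageUsage("alice", "org-a", "unit-1", quota_gb=500, used_gb=120),
    StorageUsage("bob", "org-a", "unit-1", quota_gb=500, used_gb=30),
    StorageUsage("carol", "org-b", "unit-9", quota_gb=1000, used_gb=0),
]

if __name__ == "__main__":
    print(usage_per_org_unit(SAMPLE))
```

In this sketch a user with a large quota but no assigned storage contributes nothing to the bill, which is the practical difference between per-usage and per-quota billing.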