New Features in OpenStack Stein and Train

Community Goals

Each release, the community defines several community-wide goals that all projects are expected to implement. This keeps the many projects under the OpenStack umbrella consistent as a whole, rather than letting each evolve in its own direction. For the Stein and Train releases, the community defined the following goals:

The community started pushing the migration from Python 2 to Python 3 long ago. In Stein, the default runtime was switched to Python 3, and support for Python 2 is being gradually deprecated.

Upgrading OpenStack has always been a hard problem, and the community has gradually taken measures to address it, including skip-level upgrades and online upgrades. In Stein, the community additionally mandated support for upgrade checks, so that a check command can be run before an upgrade to verify that all resources meet the upgrade requirements.
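The idea behind such a pre-upgrade check can be sketched in a few lines of Python; the check names and pass conditions below are hypothetical, loosely modeled on the community's oslo.upgradecheck framework rather than taken from any one project:

```python
from enum import Enum

class Code(Enum):
    SUCCESS = 0
    WARNING = 1
    FAILURE = 2

def check_placement_ready():
    # Hypothetical check: is the standalone Placement endpoint configured?
    return Code.SUCCESS, "placement endpoint configured"

def check_service_versions():
    # Hypothetical check: do all services run at least the required version?
    return Code.SUCCESS, "all services at required version"

def run_upgrade_checks(checks):
    """Run every registered check and return the worst result code."""
    worst = Code.SUCCESS
    for name, check in checks:
        code, detail = check()
        print(f"{name}: {code.name} ({detail})")
        if code.value > worst.value:
            worst = code
    return worst

checks = [
    ("placement", check_placement_ready),
    ("service versions", check_service_versions),
]
result = run_upgrade_checks(checks)  # Code.SUCCESS means safe to proceed
```

A real `<project>-status upgrade check` command works the same way: it aggregates many such checks and exits non-zero if any of them fails, so operators can gate the upgrade on it.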

Enable every project to be deployed in an IPv6-only environment.

Feature Overview

The biggest functional changes in these two releases are enhancements and improvements around resource management, hardware acceleration, and bare metal management; the projects in these areas were particularly active.

  • Placement extracted into a standalone resource-management project

Placement exists to manage the ever-growing variety of resources in OpenStack beyond basics such as CPU and memory. It was originally incubated inside the Nova API; now that it is mature, it has been split out into a separate project with its own API and database, and the community provides steps for migrating from the Nova API to standalone Placement.

  • New hardware accelerator project: Cyborg

With growing 5G and AI demand, the need to manage various accelerators effectively in a cloud platform keeps increasing. Cyborg provides a management framework for hardware and software accelerators: software such as DPDK/SPDK and PMEM, and hardware such as FPGAs, GPUs, ARM SoCs, and NVMe SSDs. Working together with Nova and Placement, it lets virtual machines be scheduled onto compute nodes that satisfy specific accelerator requirements, after which Cyborg attaches the accelerator to the virtual machine.

  • Bare metal management

Demand for bare metal management is also growing, so the Ironic project has been quite active. Ironic already supports drivers for a wide range of hardware, and recently added Redfish, a generic protocol similar to IPMI, so that it can manage physical hardware that follows the Redfish standard. In addition, to let bare metal nodes use smart NICs and implement more advanced networking features, Ironic, Nova, and Neutron all landed corresponding improvements during these two release cycles.

Feature Rundown

Cinder

Stein

  • Added multiattach and deferred deletion support to the RBD driver

Train

  • When uploading a volume to Glance, hardware acceleration can now be used for the compression and format-conversion steps, e.g. Intel QAT, yielding a higher compression ratio while lowering CPU usage and reducing upload time
  • Added a volume re-image API, introduced to help Nova fix the problem that boot-from-volume instances could not be rebuilt: it natively reloads image data into a volume, which makes Nova's rebuild operation straightforward
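The compression step in that upload path can be sketched in plain Python; a real deployment would offload the `gzip.compress` call below to an accelerator such as Intel QAT instead of burning CPU on it:

```python
import gzip

def compress_for_upload(volume_bytes: bytes) -> bytes:
    """Software compression path. A QAT-enabled deployment would hand this
    step to the accelerator rather than run zlib/gzip on the host CPU."""
    return gzip.compress(volume_bytes, compresslevel=6)

def decompress_on_download(blob: bytes) -> bytes:
    return gzip.decompress(blob)

raw = b"\x00" * (1024 * 1024)   # sparse volume data compresses very well
blob = compress_for_upload(raw)
assert decompress_on_download(blob) == raw
assert len(blob) < len(raw)     # upload is smaller, so it finishes faster
```

The win is twofold: less data travels to Glance, and with hardware offload the compression itself no longer competes with guest workloads for CPU.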

Glance

  • Glance introduced the multi-store capability in Rocky, i.e. a single Glance can be backed by multiple storage backends, similar to Cinder's multi-backend. Over these two releases the multi-store feature was further strengthened and stabilized, and as of Train it is considered production-ready.
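As an illustration, a multi-store glance-api.conf might look roughly like this; the store names `fast` and `cheap` are made up, and the exact option set should be checked against the Glance documentation:

```ini
# glance-api.conf (illustrative multi-store layout)
[DEFAULT]
enabled_backends = fast:rbd, cheap:file

[glance_store]
default_backend = fast

[fast]
rbd_store_pool = images
rbd_store_user = glance

[cheap]
filesystem_store_datadir = /var/lib/glance/images/
```

End users can then direct an image to a specific store at upload time instead of always landing on the single configured backend.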

Ironic

Stein

  • Further strengthened support for Redfish hardware: introspection can now be performed via Redfish instead of the IPA approach, and BIOS settings can also be configured via Redfish
  • Introduced the Deployment Templates feature, letting users customize the hardware deployment process through templates
  • Added smart NIC support, making physical machines' network configuration more flexible
  • Added the Allocation API: Ironic itself now provides an API that selects a candidate machine based on given criteria, without depending on an external scheduler, which is convenient for standalone Ironic users

Train

  • Added support for configuring software RAID

Keystone

Stein

  • Added a limits API to provide global quota support for other projects; going forward, projects can rely on this API to standardize quota management
  • Added support for JSON Web Tokens, the second token mechanism introduced alongside Fernet tokens
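For illustration, the three-part structure of such a token can be shown with the standard library alone. Keystone actually signs its JWS tokens asymmetrically (ES256); this sketch substitutes a placeholder signature, so the token would never validate against a real Keystone:

```python
import base64
import json
import time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(s: str) -> bytes:
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

# header.payload.signature, each segment base64url-encoded
header = {"typ": "JWT", "alg": "ES256"}
payload = {
    "sub": "user-id-example",          # hypothetical subject
    "iat": int(time.time()),
    "exp": int(time.time()) + 3600,    # one-hour validity, for illustration
}
token = ".".join([
    b64url(json.dumps(header).encode()),
    b64url(json.dumps(payload).encode()),
    b64url(b"placeholder-signature"),  # a real token is ES256-signed here
])
```

Because the signature is asymmetric, validators only need the public key, which is why keystone servers no longer have to synchronize private key material the way Fernet requires.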

Train

  • Keystone's APIs now support tiered roles, with reader, member, and admin roles set up by default. The main addition is the read-only reader role: read-only rules and roles are now built into the system and no longer need to be set up separately by an administrator
  • Critical Keystone resources such as projects, roles, and domains can now be marked immutable, to prevent accidental deletion from disrupting services

Kolla

Stein

  • Added deployment support for Placement
  • Added deployment support for Cyborg
  • Support configuring a dedicated migration network for virtual machine migration
  • Support configuring maximum files and processes limits for the nova_libvirt container; the default of 1024 is too small when using Ceph
  • Support rolling upgrades of Nova and Neutron
  • Docker logs are now limited in size and no longer grow without bound

Train

  • Added deployment support for Masakari
  • Support running control-plane services in an IPv6-only environment
  • Added a new command, deploy-containers, which only updates images when there are no configuration changes, speeding up deployment
  • Support passing extra arguments to RabbitMQ via the rabbitmq_server_additional_erl_args variable
  • Support mounting additional volumes into containers, via _extra_volumes
  • Added CentOS 8 support, for both the host OS and container images; Train supports both CentOS 7 and CentOS 8
  • Added a docker_disable_default_network option to disable the default network and bridge that Docker creates
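For reference, these Train-era options are set in Kolla Ansible's globals.yml; the values below are illustrative only, and the per-service extra-volumes variable name is an assumption based on the documented naming pattern:

```yaml
# globals.yml (illustrative settings for the Train features above)
enable_masakari: "yes"

# Extra Erlang VM arguments for the RabbitMQ server (example value):
rabbitmq_server_additional_erl_args: "+S 2:2"

# Disable Docker's default docker0 bridge network:
docker_disable_default_network: "yes"

# Hypothetical per-service extra volume mount
# (variable name pattern is <service>_extra_volumes):
nova_compute_extra_volumes:
  - "/mnt/scratch:/mnt/scratch"
```

As noted in the upstream release notes, `docker_disable_default_network` is useful when Docker's default 172.17.0.0/16 range collides with operator networks.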

Kuryr

Stein

  • Added support for Kubernetes Network Policies, implemented via security groups
  • Added support for CRI-O as the Kubernetes container runtime

Train

  • Strengthened support for Kubernetes Network Policies
  • The Kuryr CNI plugin was rewritten in Go to make deployment easier

Neutron

Stein

  • Added support for QoS minimum-bandwidth ports, allowing Nova to schedule instances according to a port's minimum bandwidth requirement
  • Added the Network Segment Range Management feature, which gives administrators more flexible management of network segment ranges such as VLAN IDs and VXLAN VNIs
  • Improved the performance of bulk port creation, mainly for Kubernetes-on-Neutron use cases

Train

  • Introduced a new API, extraroute-atomic, which updates a router's routing table atomically, preventing race conditions when multiple clients update it concurrently
  • Added smart NIC support to the ML2/OVS mechanism driver, so ports backed by a smart NIC can be created
  • Support changing the segmentation ID of a provider network
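The race that extraroute-atomic eliminates can be simulated with two clients that each read the route table and write back their own merged copy; the payload shapes are simplified here, and the real operations are PUT requests against the Neutron API:

```python
# Routes are (destination, nexthop) pairs.

def put_routes(router, new_table):
    # Old API: PUT /v2.0/routers/ROUTER-ID replaces the whole 'routes' list.
    router["routes"] = list(new_table)

def add_extraroutes(router, routes):
    # New API: PUT /v2.0/routers/ROUTER-ID/add_extraroutes merges server-side.
    for r in routes:
        if r not in router["routes"]:
            router["routes"].append(r)

router = {"routes": []}

# Two clients snapshot the (empty) table, then each writes snapshot + own route:
snapshot_a = list(router["routes"])
snapshot_b = list(router["routes"])
put_routes(router, snapshot_a + [("10.0.0.0/24", "192.0.2.1")])
put_routes(router, snapshot_b + [("10.0.1.0/24", "192.0.2.2")])
assert ("10.0.0.0/24", "192.0.2.1") not in router["routes"]  # lost update!

# With server-side atomic adds, both routes survive any interleaving:
router = {"routes": []}
add_extraroutes(router, [("10.0.0.0/24", "192.0.2.1")])
add_extraroutes(router, [("10.0.1.0/24", "192.0.2.2")])
assert len(router["routes"]) == 2
```

Since the merge happens on the server, clients never need to read the full table first, so there is no window in which a concurrent writer can be overwritten.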

Nova

Stein

  • Support using the standalone Placement service
  • A volume_type can now be specified when creating a virtual machine, simplifying instance creation on a specific storage backend; previously one had to create a volume of the desired volume type first and then boot from that volume
  • Users can create virtual machines with quality-of-service minimum-bandwidth ports; the instance is automatically scheduled onto a compute node that can honor the port's bandwidth guarantee
  • The maximum number of volumes attachable to a single virtual machine is now configurable; the libvirt driver previously capped this at 26
  • CPU and memory overcommit ratios can now be set via the Placement API or the Nova configuration file
  • Compute nodes can now report their capabilities as traits to Placement, and flavors can use these traits during scheduling
  • Live migration improvement: if a live migration runs past its timeout without completing, it can be forced to complete, which pauses the virtual machine briefly and forcibly migrates it
  • Refactored the os-vif code to provide a more generic framework for offloading NIC datapaths to physical hardware
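The overcommit arithmetic behind allocation ratios is simple; this sketch mirrors how Placement-style inventory math works (schedulable capacity = (total - reserved) × allocation ratio), with example ratios rather than recommended values:

```python
def schedulable_capacity(total, reserved, allocation_ratio):
    """How much of one resource class the scheduler may hand out on a host."""
    return int((total - reserved) * allocation_ratio)

# A 32-core host reserving 2 cores for the hypervisor, with an example
# cpu_allocation_ratio of 16.0:
assert schedulable_capacity(32, 2, 16.0) == 480

# Memory (in MiB here) is typically overcommitted far less aggressively:
assert schedulable_capacity(131072, 4096, 1.5) == 190464
```

Being able to set these ratios via the Placement API, not just nova.conf, means they can be tuned per host at runtime without a config change and service restart.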

Train

  • Support live migration of virtual machines that use a NUMA topology, pinned CPUs, and/or huge pages
  • Support live migration of virtual machines with SR-IOV ports attached
  • Support cold migration and resize of virtual machines with bandwidth-aware Quality of Service ports attached
  • Added more operational commands to nova-manage, such as archiving deleted, no-longer-needed records
  • Added support for persistent memory, providing hardware-level support for memory-hungry applications such as HPC workloads and in-memory databases like redis and rocksdb
  • Support the forbidden aggregates feature, allowing a specific group of servers to be reserved for particular purposes and kept off-limits to other virtual machines' scheduling
  • Improved live migration between servers with heterogeneous CPUs: with cpu_mode set to custom, a list of cpu_model values can be defined, and live migration can proceed whenever one of them is compatible with the destination host

Octavia

Stein

  • Support selecting different flavors when creating a load balancer; previously the flavor was fixed and could not be chosen
  • Support encrypted traffic between haproxy and the backend servers
  • Added an administrator API to view per-amphora load balancer statistics

Train

  • Support offloading service logs from the amphora virtual machine, allowing logs to be exported elsewhere
  • Support creating volume-backed amphora virtual machines

Oslo

oslo.config

  • A major improvement in oslo.config is a driver mechanism for configuration backends, allowing configuration to be stored in a backend rather than only in configuration files. The Castellan project implements an oslo.config driver that lets plaintext password settings be stored in Castellan instead
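The driver mechanism is wired up through a config_source option in the main configuration file. The sketch below assumes the Castellan driver is installed; the option names under the source group follow the oslo.config drivers spec and should be verified against the Castellan documentation:

```ini
# service.conf (illustrative; assumes the Castellan driver is installed)
[DEFAULT]
config_source = secrets

[secrets]
driver = castellan
config_file = castellan.conf   # Castellan/keystore connection settings
mapping_file = mapping.conf    # maps config options to keystore secret IDs
```

The service then resolves the mapped options from the keystore at startup, so passwords never sit in plain text on disk.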

oslo.messaging

  • With amqp<=2.4.0 and TLS enabled, oslo.messaging's message queue was unreliable (see bug 1800957); amqp fixed this in 2.4.1, which is now the minimum version oslo.messaging requires

Placement

  • Placement became a standalone project in this release, versioned 1.0.0, and a document, Upgrading from Nova to Placement, was provided to guide the migration and upgrade, in particular the database migration from the previous nova_api database to Placement's own database
  • As of Placement 2.0.0, Nova must use the standalone Placement service

References

English version

cinder

glance

  • rocky
    • An implementation of multiple backend support, which allows operators to configure multiple stores and allows end users to direct image data to a specific store, is introduced as the EXPERIMENTAL Image Service API version 2.8
  • train
    • Glance multi-store feature has been deemed stable

ironic

  • stein
    • Adds additional interfaces for management of hardware including Redfish BIOS settings, explicit iPXE boot interface option, and additional hardware support.
      • Promote iPXE to separate boot interface
      • Out of Band Inspection support for redfish hardware type
    • Increased capabilities and options for operators including deployment templates, improved parallel conductor workers and disk erasure processes, deployed node protection and descriptions, and use of local HTTP(S) servers for serving images.
      • Expose conductor information from API
      • Deploy Templates
    • Smart NIC Networking
    • Allocation API
      • Improved options for standalone users to request allocations of bare metal nodes and submit configuration data as opposed to pre-formed configuration drives. Additionally allows for ironic to be leveraged using JSON-RPC as opposed to an AMQP message bus.
      • For some time, standalone Ironic users have requested the ability to pick a node via the API based on some criteria and reserve it for deployment (i.e. set instance_uuid on it).
      • Given a resource class and, optionally, a list of required traits, return an available bare metal node and set instance_uuid on it to mark it as reserved.
  • train
    • Basic support for building software RAID

keystone

  • stein
    • The limits API now supports domains in addition to projects, so quota for resources can be allocated to top-level domains and distributed among children projects.
      • Domain Level Unified Limit Support
    • JSON Web Tokens are added as a new token format alongside fernet tokens, enabling support for an internet-standard format. JSON Web Tokens are asymmetrically signed, so synchronizing private keys across keystone servers is no longer required with this token format.
      • Add JSON Web Tokens as a Non-persistent Token Provider
    • Multiple keystone APIs now use default reader, member, and admin roles instead of a catch-all role, which reduces the need for customized policies to create read-only access for certain users.
  • train
    • All keystone APIs now use the default reader, member, and admin roles in their default policies. This means that it is now possible to create a user with finer-grained access to keystone APIs than was previously possible with the default policies. For example, it is possible to create an “auditor” user that can only access keystone’s GET APIs. Please be aware that depending on the default and overridden policies of other OpenStack services, such a user may still be able to access creative or destructive APIs for other services.
    • Keystone roles, projects, and domains may now be made immutable, so that certain important resources like the default roles or service projects cannot be accidentally modified or deleted. This is managed through resource options on roles, projects, and domains. The keystone-manage bootstrap command now allows the deployer to opt into creating the default roles as immutable at deployment time, which will become the default behavior in the future. Roles that existed prior to running keystone-manage bootstrap can be made immutable via resource update.
      • Immutable Resources

kolla

  • stein
    • Added an image and playbooks for the OpenStack Placement service, which has been extracted from Nova into a separate project.
    • Adds support for deploying the OpenStack Cyborg service. Cyborg is a service for managing hardware accelerators.
    • Adds support for a dedicated migration network. This is configured via the variables migration_interface and migration_interface_address.
    • Adds support for using a separate network for Octavia. This is configured via octavia_network_interface and octavia_network_interface_address.
    • Adds support for configuring the maximum files and processes limits in the nova_libvirt container, via the qemu_max_files and qemu_max_processes variables. The default values for these are 32768 and 131072 respectively. This is useful when Nova uses Ceph as a backend, since the default limit of 1024 is often not enough.
    • Implements Neutron rolling upgrade logic, applied for Neutron server, VPNaaS and FWaaS because only these projects have support for rolling upgrade database migration.
    • Implements Nova rolling upgrade logic.
    • Docker logs are no longer allowed to grow unbounded and have been limited to a fixed size per container. Two new variables have been added, docker_log_max_file and docker_log_max_size which default to 5 and 50MB respectively. This means that for each container, there should be no more than 250MB of Docker logs.
  • train
    • Introduced images and playbooks for Masakari, which supports instance High Availability, and Qinling, which provides Functions as a Service.
    • Added support for control plane communication via IPv6.
    • Adds a new kolla-ansible subcommand: deploy-containers. This action will only do the container comparison and deploy new containers if that comparison detects a change is needed. This should be used to get updated container images, where no new config changes are needed, deployed quickly.
    • It is now possible to pass RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS to RabbitMQ server’s Erlang VM via the newly introduced rabbitmq_server_additional_erl_args variable. See Kolla Ansible docs RabbitMQ section for details.
    • Adds support for configuring additional Docker volumes for Kolla containers. These are configured via _extra_volumes.
    • Adds support for CentOS 8 as a host Operating System and base container image. This is the only major version of CentOS supported from the Ussuri release. The Train release supports both CentOS 7 and 8 hosts, and provides a route for migration.
    • Adds a new flag, docker_disable_default_network, which defaults to no. Docker is using 172.17.0.0/16 by default for bridge networking on docker0, and this might cause routing problems for operator networks. Setting this flag to yes will disable Docker’s bridge networking. This feature will be enabled by default from the Wallaby 12.0.0 release.

kuryr

  • stein
    • Added support for handling and reacting to Network Policies events from kubernetes, allowing Kuryr-Kubernetes to handle security group rules on the fly based on them.
    • Added support for K8s configured to use CRI-O, the Open Container Initiative-based implementation of Kubernetes Container Runtime Interface as container runtime.
  • train
    • Stabilization of the support for Kubernetes Network Policy.
    • The Kuryr CNI plugin has been rewritten in Go to make deploying it easier.

neutron

  • stein
    • New framework for the neutron-status upgrade check command is added. This framework allows adding various checks which can be run before a Neutron upgrade to verify that the upgrade can be performed safely. Stadium and 3rd party projects can register their own checks to this new neutron-status CLI tool using entrypoints in the neutron.status.upgrade.checks namespace.
    • Support for strict minimum bandwidth based scheduling. With this feature, Nova instances can be scheduled to compute hosts that will honor the minimum bandwidth requirements of the instance as defined by QoS policies of its ports.
    • Network Segment Range Management. This feature enables cloud administrators to manage network segment ranges dynamically via a new API extension, as opposed to the previous approach of editing configuration files. This feature targets StarlingX and edge use cases, where ease of management is paramount.
    • Speed up Neutron port bulk creation. The targets are containers / k8s use cases, where ports are created in groups.
  • train
    • A new API, extraroute-atomic, has been implemented for Neutron routers. This extension enables users to add or delete individual entries to a router routing table, instead of having to update the entire table as one whole. The new API extension extraroute-atomic introduces two new member actions on routers to add/remove routes atomically on the server side. The use of these new member actions (PUT /v2.0/routers/ROUTER-ID/add_extraroutes and PUT /v2.0/routers/ROUTER-ID/remove_extraroutes) is always preferred to the old way (PUT /v2.0/routers/ROUTER-ID) when multiple clients edit the extra routes of a router since the old way is prone to race conditions between concurrent clients and therefore to possible lost updates.
    • Support for L3 conntrack helpers has been added. Users can now configure conntrack helper target rules to be set for a router. This is accomplished by associating a conntrack_helper sub-resource to a router.
    • Add Support for Smart NIC in ML2/OVS mechanism driver, by extending the Neutron OVS mechanism driver and Neutron OVS Agent to bind the Neutron port for the baremetal host with Smart NIC.
    • The segmentation ID of a provider network can be now modified, even with OVS ports bound. Note that, during this process, the traffic of the bound ports tagged with the former segmentation ID (external VLAN) will be mapped to the new one. This can provoke a traffic disruption while the external network VLAN is migrated to the new tag.
      • Change the segment ID of a VLAN provider network

nova

  • stein
    • It is now possible to run Nova with version 1.0.0 of the recently extracted placement service, hosted from its own repository. Note that install/upgrade of an extracted placement service is not yet fully implemented in all deployment tools. Operators should check with their particular deployment tool for support before proceeding. See the placement install and upgrade documentation for more details. In Stein, operators may choose to continue to run with the integrated placement service from the Nova repository, but should begin planning a migration to the extracted placement service by Train, as the removal of the integrated placement code from Nova is planned for the Train release.
    • Users can now specify a volume type when creating servers.
      • Boot instance specific storage backend
      • Currently, when creating a new boot-from-volume instance, the user can only control the type of the volume by pre-creating a bootable image-backed volume with the desired type in cinder and providing it to nova during the boot process. When the user wants to boot the instance on a specific backend, this is not user-friendly in environments with multiple storage backends.
      • As a user, I would like to specify volume type to my instances when I boot them, especially when I am doing bulk boot. The “bulk boot” means creating multiple servers in separate requests but at the same time.
    • Users can now create servers with Neutron ports that have quality-of-service minimum bandwidth rules.
      • Network Bandwidth resource provider
    • Configure maximum number of volumes to attach
      • Currently, there is a limitation in the libvirt driver restricting the maximum number of volumes to attach to a single instance to 26.
    • Operators can now set overcommit allocation ratios using Nova configuration files or the placement API.
      • Default allocation ratio configuration
    • Compute driver capabilities are now automatically exposed as traits in the placement API so they can be used for scheduling via flavor extra specs and/or image properties.
    • Live-Migration force after timeout
    • Generic os-vif datapath offloads
      • The existing method in os-vif is to pass datapath offload metadata via a VIFPortProfileOVSRepresentor port profile object. This is currently used by the ovs reference plugin and the external agilio_ovs plugin. This spec proposes a refactor of the interface to support more VIF types and offload modes.
  • train
    • Live migration support for servers with a NUMA topology, pinned CPUs and/or huge pages, when using the libvirt compute driver.
      • NUMA-aware live migration
    • Live migration support for servers with SR-IOV ports attached when using the libvirt compute driver.
    • Support for cold migrating and resizing servers with bandwidth-aware Quality of Service ports attached. Cold migration and resize are now supported for servers with neutron ports having resource requests. E.g. ports that have QoS minimum bandwidth rules attached. 
    • Improved operational tooling for things like archiving the database and healing instance resource allocations in Placement.
      • nova-manage db archive_deleted_rows 
    • Support for VPMEM (Virtual Persistent Memory) when using the libvirt compute driver. This provides data persistence across power cycles at a lower cost and with much larger capacities than DRAM, especially benefitting HPC and memory databases such as redis, rocksdb, oracle, SAP HANA, and Aerospike.
      • support virtual persistent memory
    • Train is the first cycle where Placement is available solely from its own project and must be installed separately from Nova.
    • Added support for forbidden aggregates which allows groups of resource providers to only be used for specific purposes, such as reserving a group of compute nodes for licensed workloads.
    • Select CPU model from a list of CPU models

octavia

  • stein
    • Octavia now supports load balancer “flavors”. This allows an operator to create custom load balancer “flavors” that users can select when creating a load balancer.
    • Octavia now supports backend re-encryption of connections to member servers. Backend re-encryption allows users to configure pools to initiate TLS connections to the backend member servers. This enables load balancers to authenticate and encrypt connections from the load balancer to the backend member server.
    • Added new tool octavia-status upgrade check. This framework allows adding various checks which can be run before an Octavia upgrade to verify that the upgrade can be performed safely.
    • Adds an administrator API to access per-amphora statistics
  • train
    • Octavia now supports Amphora log offloading. Operators can define syslog targets for the Amphora administrative logs and for the tenant load balancer flow logs.
    • Allow creation of volume-based amphorae. Many production deployments use volume-based instances for greater flexibility. Octavia will create a volume and attach it to the amphora.

oslo

  • oslo.config
    • Added a Castellan config driver that allows secrets to be moved from on-disk config files to any Castellan-compatible keystore. This driver lives in the Castellan project, so Castellan must be installed in order to use it.
      • Various regulations and best practices say that passwords and other secret values should not be stored in plain text in configuration files. There are “secret store” services to manage values that should be kept secure. Castellan provides an abstraction API for accessing those services. Castellan also depends on oslo.config, which means oslo.config cannot use castellan directly.
      • https://specs.openstack.org/openstack/oslo-specs/specs/queens/oslo-config-drivers.html
  • oslo.messaging
    • In combination with amqp<=2.4.0, oslo.messaging was unreliable when configured with TLS (as is generally recommended). Users would see frequent errors such as this: MessagingTimeout: Timed out waiting for a reply to message ID ae039d1695984addbfaaef032ce4fda3 Such issues would typically lead to downstream service timeouts, with no recourse available other than disabling TLS altogether (see bug 1800957). The underlying issue is fixed in amqp version 2.4.1, which is now the minimum version that oslo.messaging requires.

placement

  • stein
    • The 1.0.0 release of Placement is the first release where the Placement code is hosted in its own repository and managed as its own OpenStack project. Because of this, the majority of changes are not user-facing. There are a small number of new features (including microversion 1.31) and bug fixes, listed below.
    • A new document, Upgrading from Nova to Placement, has been created. It explains the steps required to upgrade to extracted Placement from Nova and to migrate data from the nova_api database to the placement database.
  • train
    • The 2.0.0 release of placement is the first release where placement is available solely from its own project and must be installed separately from nova. If the extracted placement is not already in use, prior to upgrading to Train, the Stein version of placement must be installed. See Upgrading from Nova to Placement for details.
Author

hackerain

Published

2021-11-19

Updated

2023-03-11

License