Using RadosGW object storage

http://docs.ceph.org.cn/radosgw/
An object is the basic unit of data storage in an object storage system. Each object is a combination of data and a set of data attributes; the attributes can be set according to application requirements, covering things like data distribution and quality of service. Each object maintains its own attributes, which simplifies the storage system's management tasks. Objects can vary in size. Object storage is a storage method without hierarchy and is commonly used in cloud computing environments. Unlike other data storage methods, object-based storage does not use a directory tree:
Data is stored as discrete objects.
The data is not placed in a directory hierarchy; everything exists at the same level in a flat address space.
The application identifies each individual data object by a unique address.
Each object can contain metadata that facilitates retrieval.
Object storage is designed for access at the application level via APIs, not at the user level.
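For example, through an S3-style API an object is addressed by a flat bucket/key pair instead of a filesystem path. A hypothetical request against the gateway endpoint configured later in this article (the bucket and key names are made up):

curl http://rgw.chuan.net:8000/mybucket/reports/2021/report.pdf   # "reports/2021/" is simply part of the key, not real directories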
 
1. Introduction to RadosGW
RADOSGW is an implementation of OSS (object storage service). The RADOS gateway, also known as the Ceph object gateway, RADOSGW, or RGW, is a service that enables clients to access a Ceph cluster using standard object storage APIs: it supports the AWS S3 and Swift APIs. RGW runs on top of librados. Since Ceph version 0.80, the embedded Civetweb web server is used to respond to API requests; nginx or apache can be used instead. Clients communicate with RGW through a RESTful API over http/https, while RGW communicates with the Ceph cluster using librados. A client authenticates to RGW as an RGW user via the S3 or Swift API, and RGW then authenticates to the Ceph storage cluster on behalf of the user using cephx.
S3 was launched by Amazon in 2006; its full name is Simple Storage Service. S3 defined object storage and is the de facto standard for it. In a sense, S3 is object storage and object storage is S3: it dominates the object storage market, and later object storage systems are imitations of S3.
2. Object storage features
With object storage, data is stored as objects; each object contains not only the data itself but also the data's metadata.
Objects are retrieved through an object ID. Objects cannot be accessed directly by file path and file name as in an ordinary file system; they can only be accessed through the API or through a third-party client (which in fact wraps the API).
Objects in an object store are not organized into a directory tree but are stored in a flat namespace. Amazon S3 calls this flat namespace a bucket, while Swift calls it a container.
Neither buckets nor containers can be nested.
Buckets require authorization to access; one account can be authorized on multiple buckets with different permissions.
Object storage scales out easily and retrieves data quickly. It does not support client-side mounting, and the client must specify the object name when accessing it.
It is not well suited to scenarios where files are modified or deleted very frequently.
Ceph uses buckets as storage containers (storage spaces) to hold object data and to isolate multiple users. Data is stored in buckets and user permissions are also granted per bucket; different permissions can be set on different buckets to implement permission management.
Bucket characteristics:
A storage space (bucket) is the container used to store objects, and every object must belong to one. Storage space attributes can be set and modified to control region, access permissions, life cycle, and so on; these settings directly affect all objects in the space, so different storage spaces can be created flexibly for different management purposes.
The interior of a storage space is flat: there is no concept of file system directories, and all objects belong directly to their storage space.
Each user can own multiple storage spaces.
The name of a storage space must be globally unique within OSS and cannot be changed once created.
There is no limit on the number of objects inside a storage space.
Bucket naming conventions:
Only lowercase letters, numbers, and dashes (-) may be used.
The name must start and end with a lowercase letter or number.
The length must be between 3 and 63 bytes.
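A few hypothetical names checked against these rules:

my-bucket-01   # valid: lowercase letters, numbers, dashes
3logs          # valid: may start with a number
My_Bucket      # invalid: uppercase letter and underscore
-data          # invalid: must start with a lowercase letter or number
ab             # invalid: shorter than 3 bytes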

3. Object storage access comparison:

Amazon S3: provides user, bucket, and object, representing the user, the storage bucket, and the object respectively. A bucket belongs to a user; a user can be granted different access permissions on different buckets, and different users can be allowed to access the same bucket.
OpenStack Swift: provides user, container, and object, corresponding to user, bucket, and object respectively. In addition, it provides a parent component, account, above the user to represent a project or tenant; an account can contain one or more users who share the same set of containers, and the account provides a namespace for those containers.
RadosGW: provides user, subuser, bucket, and object, where user corresponds to the S3 user and subuser corresponds to the Swift user. Neither user nor subuser provides a namespace for buckets, so buckets of different users cannot have the same name. Since the Jewel release, however, RadosGW has introduced tenant to provide a namespace for users and buckets, although it is an optional component. Based on ACLs, RadosGW sets different permission controls for different users, such as:
Read: read plus execute permission
Write: write permission
Readwrite: read and write permission
Full control: full control permission
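For example, a Swift-style subuser can be created with one of these permission levels via radosgw-admin (a sketch; user1 is only created later in this article):

radosgw-admin subuser create --uid=user1 --subuser=user1:swift --access=full
# --access accepts read, write, readwrite, or full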
4. Deploy radosgw service
cephadmin@ceph-deploy:~$ ceph osd pool ls
device_health_metrics
.rgw.root            # contains realm (domain) information, such as zones and zonegroups
default.rgw.log      # stores log information, used to record various logs
default.rgw.control  # system control pool; notifies the other RGWs to update their caches when data changes
default.rgw.meta     # metadata pool; stores different rados objects in different namespaces, including the user uid and bucket mapping namespace users.uid, the user key namespace users.keys, the user email namespace users.email, the user subuser namespace users.swift, and the bucket namespace root
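These namespaces can be inspected directly with the rados tool, for example:

rados -p default.rgw.meta -N users.uid ls    # user uid and bucket mapping objects
rados -p default.rgw.meta -N users.keys ls   # user access key objects
rados -p default.rgw.meta -N root ls         # bucket metadata objects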
root@ceph-mon01-mgr01:~# ps -ef|grep rados
ceph        852      1  0 21:46 ?        00:00:06 /usr/bin/radosgw -f --cluster ceph --name client.rgw.ceph-mon01-mgr01 --setuser ceph --setgroup ceph
cephadmin@ceph-deploy:~$ curl http://192.168.192.172:7480
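An anonymous request like the one above typically returns an empty bucket list as XML, along the lines of:

<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>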

Change the default port (7480) to a custom port

root@ceph-mon01-mgr01:~# cat /etc/ceph/ceph.conf 
[global]
fsid = d2cca32b-57dc-409f-9605-b19a373ce759
public_network = 192.168.192.0/24
cluster_network = 192.168.227.0/24
mon_initial_members = ceph-mon01-mgr01
mon_host = 192.168.192.172
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

[client.rgw.ceph-mon01-mgr01] # append this custom configuration for the current node at the end of the file:
rgw_host = ceph-mon01-mgr01
rgw_frontends = civetweb port=8000
systemctl restart ceph-radosgw@rgw.ceph-mon01-mgr01.service
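After the restart, a quick check that the gateway now answers on the new port instead of the default 7480 (IPs as above):

curl http://192.168.192.172:8000
ss -tnlp | grep 8000    # radosgw should be listening on 8000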

 

[root@localhost haproxy]# cat haproxy.cfg   # the firewall and SELinux are disabled on this node
listen ceph-rgw
  bind 192.168.192.129:80
  mode tcp
  server rgw1 192.168.192.172:8000 check inter 3s fall 3 rise 5
  server rgw2 192.168.192.173:8000 check inter 3s fall 3 rise 5
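After reloading haproxy, requests through the VIP should reach one of the rgw backends, e.g.:

systemctl restart haproxy
curl http://192.168.192.129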
root@ceph-mon01-mgr01:~# radosgw-admin zone get --rgw-zone=default

Enable SSL

Self-signed certificate

cephadmin@ceph-mon01-mgr01:/etc/ceph/certs$ sudo openssl genrsa -out civetweb.key 2048
cephadmin@ceph-mon01-mgr01:/etc/ceph/certs$ touch /home/cephadmin/.rnd
cephadmin@ceph-mon01-mgr01:/etc/ceph/certs$ sudo openssl req -new -x509 -key civetweb.key -out civetweb.crt -subj "/CN=rgw.chuan.net"
root@ceph-mon01-mgr01:/etc/ceph/certs# cat civetweb.key civetweb.crt > civetweb.pem
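The resulting certificate can be sanity-checked before use:

openssl x509 -in civetweb.crt -noout -subject -dates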

SSL configuration

root@ceph-mon01-mgr01:/etc/ceph# cat ceph.conf # add this configuration, and copy the pem file, to both rgw nodes
[client.rgw.ceph-mon01-mgr01] # append at the end of the file for the current node:
rgw_host = ceph-mon01-mgr01
#rgw_frontends = civetweb port=8000   # the previous frontend line, replaced by the SSL-enabled one below
rgw_frontends = "civetweb port=8000+8443s ssl_certificate=/etc/ceph/certs/civetweb.pem"
root@ceph-mon01-mgr01:/etc/ceph# systemctl restart ceph-radosgw@rgw.ceph-mon01-mgr01.service ceph-radosgw@rgw.ceph-node01.service
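A quick verification, assuming rgw.chuan.net resolves to this node (-k accepts the self-signed certificate):

curl -k https://rgw.chuan.net:8443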

Update the haproxy configuration so that the 8443 SSL port is also proxied:
listen ceph-rgw
  bind 192.168.192.129:80
  mode tcp
  server rgw1 192.168.192.172:8000 check inter 3s fall 3 rise 5
  server rgw2 192.168.192.173:8000 check inter 3s fall 3 rise 5

listen ceph-rgws
  bind 192.168.192.129:8443
  mode tcp
  server rgw1 192.168.192.172:8443 check inter 3s fall 3 rise 5
  server rgw2 192.168.192.173:8443 check inter 3s fall 3 rise 5

Verify on the rgw nodes that radosgw is now listening on the SSL port:
root@ceph-node01:/etc/ceph# netstat -anp|grep 8443

tcp 0 0 0.0.0.0:8443 0.0.0.0:* LISTEN 7304/radosgw

Optimized configuration

root@ceph-node01:/etc/ceph# mkdir /var/log/radosgw
root@ceph-node01:/etc/ceph# chown ceph.ceph /var/log/radosgw -R
[client.rgw.ceph-node01] # append at the end of the file for the current node:
rgw_host = ceph-node01
#rgw_frontends = civetweb port=8000   # the previous frontend line, replaced by the tuned SSL-enabled one below
rgw_frontends = "civetweb port=8000+8443s ssl_certificate=/etc/ceph/certs/civetweb.pem request_timeout_ms=30000 error_log_file=/var/log/radosgw/civetweb.error.log access_log_file=/var/log/radosgw/civetweb.access.log num_threads=100"
systemctl restart ceph-radosgw@rgw.ceph-node01.service

View access log

root@ceph-node01:/var/log/radosgw# cat civetweb.access.log

root@ceph-mon01-mgr01:/var/log/radosgw# tail -f  civetweb.access.log

 

root@ceph-mon01-mgr01:/var/log/radosgw# radosgw-admin  user create --uid="user1" --display-name="chuan user1"
    "keys": [
        {
            "user": ""user1"",
            "access_key": "OBR7FTPC8OA0R4CBA402",
            "secret_key": "9ZxeGE0U6WRtvDTmkoTB0t2UJyNfPtZlN9J5SrsB"
        }
    ],
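If the keys are misplaced, the same user details can be printed again at any time:

radosgw-admin user info --uid=user1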

Install s3cmd client

s3cmd is a command-line client tool that accesses Ceph RGW from the command line to create buckets and to upload, download, and manage data in object storage.

root@ceph-deploy:~# apt install s3cmd

root@ceph-deploy:~# telnet rgw.chuan.net 8443

root@ceph-deploy:~# s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: OBR7FTPC8OA0R4CBA402  # input the access_key of the user created above
Secret Key: 9ZxeGE0U6WRtvDTmkoTB0t2UJyNfPtZlN9J5SrsB  # input the secret_key
Default Region [US]:  # press Enter to accept the default

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: rgw.chuan.net:8000 #input     

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: rgw.chuan.net:8000/%(bucket)  #input

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:  #enter
Path to GPG program [/usr/bin/gpg]: #enter

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: No  #No https

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name: 

New settings:
  Access Key: OBR7FTPC8OA0R4CBA402
  Secret Key: 9ZxeGE0U6WRtvDTmkoTB0t2UJyNfPtZlN9J5SrsB
  Default Region: US
  S3 Endpoint: rgw.chuan.net:8000
  DNS-style bucket+hostname:port template for accessing a bucket: rgw.chuan.net:8000/%(bucket)
  Encryption password: 
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name: 
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] Y  #test
Please wait, attempting to list all buckets...
ERROR: Test failed: [Errno 111] Connection refused

Retry configuration? [Y/n] y 
Save settings? [y/N] Y
Configuration saved to '/root/.s3cfg'

 

root@ceph-deploy:~# s3cmd la
root@ceph-deploy:~# s3cmd mb s3://chuan
Bucket 's3://chuan/' created

root@ceph-deploy:~# s3cmd ls   # The environment is OK

root@ceph-deploy:~# cat .s3cfg 
[default]
access_key = OBR7FTPC8OA0R4CBA402
access_token = 
add_encoding_exts = 
add_headers = 
bucket_location = US
ca_certs_file = 
cache_file = 
check_ssl_certificate = True
check_ssl_hostname = True
cloudfront_host = cloudfront.amazonaws.com
default_mime_type = binary/octet-stream
delay_updates = False
delete_after = False
delete_after_fetch = False
delete_removed = False
dry_run = False
enable_multipart = True
encoding = UTF-8
encrypt = False
expiry_date = 
expiry_days = 
expiry_prefix = 
follow_symlinks = False
force = False
get_continue = False
gpg_command = /usr/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = 
guess_mime_type = True
host_base = rgw.chuan.net:8000
host_bucket = rgw.chuan.net:8000/%(bucket)
human_readable_sizes = False
invalidate_default_index_on_cf = False
invalidate_default_index_root_on_cf = True
invalidate_on_cf = False
kms_key = 
limit = -1
limitrate = 0
list_md5 = False
log_target_prefix = 
long_listing = False
max_delete = -1
mime_type = 
multipart_chunk_size_mb = 15
multipart_max_chunks = 10000
preserve_attrs = True
progress_meter = True
proxy_host = 
proxy_port = 0
put_continue = False
recursive = False
recv_chunk = 65536
reduced_redundancy = False
requester_pays = False
restore_days = 1
restore_priority = Standard
secret_key = 9ZxeGE0U6WRtvDTmkoTB0t2UJyNfPtZlN9J5SrsB
send_chunk = 65536
server_side_encryption = False
signature_v2 = False
signurl_use_https = False
simpledb_host = sdb.amazonaws.com
skip_existing = False
socket_timeout = 300
stats = False
stop_on_error = False
storage_class = 
urlencoding_mode = normal
use_http_expect = False
use_https = False
use_mime_magic = True
verbosity = WARNING
website_endpoint = http://%(bucket)s.s3-website-%(location)s.amazonaws.com/
website_error = 
website_index = index.html

Create a bucket and verify permissions

A bucket is a container for storing objects. Before uploading objects of any type, first create a bucket.

root@ceph-deploy:~# s3cmd la
root@ceph-deploy:~# s3cmd mb s3://chuan
Bucket 's3://chuan/' created
root@ceph-deploy:~# s3cmd put /root/docker-binary-install.tar.gz s3://chuan/tar/  #Upload data
upload: '/root/docker-binary-install.tar.gz' -> 's3://chuan/tar/docker-binary-install.tar.gz'  [part 1 of 5, 15MB] [1 of 1]
 15728640 of 15728640   100% in    2s     6.39 MB/s  done
upload: '/root/docker-binary-install.tar.gz' -> 's3://chuan/tar/docker-binary-install.tar.gz'  [part 2 of 5, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    60.23 MB/s  done
upload: '/root/docker-binary-install.tar.gz' -> 's3://chuan/tar/docker-binary-install.tar.gz'  [part 3 of 5, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    67.17 MB/s  done
upload: '/root/docker-binary-install.tar.gz' -> 's3://chuan/tar/docker-binary-install.tar.gz'  [part 4 of 5, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    57.15 MB/s  done
upload: '/root/docker-binary-install.tar.gz' -> 's3://chuan/tar/docker-binary-install.tar.gz'  [part 5 of 5, 14MB] [1 of 1]
 15241880 of 15241880   100% in    0s    69.33 MB/s  done
root@ceph-deploy:~# ceph osd pool ls
device_health_metrics
.rgw.root
default.rgw.log
default.rgw.control
default.rgw.meta
default.rgw.buckets.index
default.rgw.buckets.non-ec
default.rgw.buckets.data

default.rgw.buckets.index    # stores the bucket-to-object index information

default.rgw.buckets.data     # stores the object data

default.rgw.buckets.non-ec   # pool for additional data information (e.g. multipart upload metadata)

root@ceph-deploy:~# s3cmd ls s3://chuan/tar/   # list the uploaded object
root@ceph-deploy:~# ceph df   # output truncated to the rgw pools
--- POOLS ---
POOL                        ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
.rgw.root                    4   32  1.3 KiB        4   48 KiB      0    142 GiB
default.rgw.log              5   32  3.6 KiB      209  408 KiB      0    142 GiB
default.rgw.control          6   32      0 B        8      0 B      0    142 GiB
default.rgw.meta             7  128    930 B        5   48 KiB      0    142 GiB
default.rgw.buckets.index    8  128      0 B       11      0 B      0    142 GiB
default.rgw.buckets.non-ec   9   32      0 B        0      0 B      0    142 GiB
default.rgw.buckets.data    10   32   75 MiB       21  224 MiB   0.05    142 GiB
root@ceph-deploy:/opt# s3cmd get s3://chuan/tar/docker-binary-install.tar.gz  # download
root@ceph-deploy:/opt# s3cmd rm s3://chuan/tar/docker-binary-install.tar.gz   # delete
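Listing the prefix again should confirm the deletion (no output is expected):

s3cmd ls s3://chuan/tar/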

 

root@ceph-deploy:~# ceph pg ls-by-pool  default.rgw.buckets.data|awk '{print $1,$2,$15}'
PG OBJECTS ACTING
10.0 1 [5,8,0]p5
10.1 1 [8,3,0]p8
10.2 0 [1,4,8]p1
10.3 2 [7,2,4]p7
10.4 1 [8,0,3]p8
10.5 1 [2,7,5]p2
10.6 2 [5,8,2]p5
10.7 2 [3,1,6]p3
10.8 3 [0,7,5]p0
10.9 1 [3,2,8]p3
10.a 1 [4,7,2]p4
10.b 1 [5,1,8]p5
10.c 0 [4,8,0]p4
10.d 1 [4,0,7]p4
10.e 1 [4,2,6]p4
10.f 2 [6,3,1]p6
10.10 1 [6,4,0]p6
10.11 0 [7,0,3]p7
10.12 0 [6,5,0]p6
10.13 5 [1,7,3]p1
10.14 1 [1,7,3]p1
10.15 0 [0,5,7]p0
10.16 2 [3,1,6]p3
10.17 2 [3,7,2]p3
10.18 2 [2,6,4]p2
10.19 0 [1,4,7]p1
10.1a 2 [3,8,2]p3
10.1b 1 [2,4,8]p2
10.1c 1 [5,1,8]p5
10.1d 0 [4,8,1]p4
10.1e 4 [6,0,4]p6
10.1f 0 [3,1,8]p3

 

root@ceph-deploy:~# ceph osd pool get default.rgw.buckets.data size
size: 3
root@ceph-deploy:~# ceph osd pool get default.rgw.buckets.data crush_rule
crush_rule: replicated_rule

 

root@ceph-deploy:~# ceph osd pool get default.rgw.buckets.data pg_num
pg_num: 32
root@ceph-deploy:~# ceph osd pool get default.rgw.buckets.data pgp_num
pgp_num: 32
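If more PGs are needed as the pool grows, both values can be raised, for example (on recent Ceph releases pgp_num is adjusted automatically when pg_num changes):

ceph osd pool set default.rgw.buckets.data pg_num 64
ceph osd pool set default.rgw.buckets.data pgp_num 64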

 

s3cmd --recursive ls s3://chuan                      # recursively list the bucket

s3cmd sync --delete-removed ./ s3://chuan --force     # sync the current directory, deleting files removed locally

s3cmd rb s3://chuan                                   # remove the bucket (it must be empty first)
