A Deep Dive into SQL and NoSQL Databases: Mastering the Mainstream Databases


Contents

Database advanced features

1. Storage engine

Choosing a storage engine

Specifying an engine

2. Relationships and foreign keys

Relationships

Foreign keys

Database transactions and more

1. Transactions

1. Introduction

2. Four properties of transactions

3. Syntax and usage

4. Examples

2. Stored procedures

Syntax

Usage

3. Operating MySQL from Python

4. SQL injection

5. Data backup and recovery

Redis and MongoDB

1. NoSQL overview

2. Getting started with Redis

1. Introduction to Redis

2. Redis application scenarios

3. Redis installation and configuration

4. Redis configuration

5. Redis server and client

6. Redis persistence

7. Using Redis from Python

3. Getting started with MongoDB

1. MongoDB overview

2. MongoDB installation and configuration

3. Basic concepts of MongoDB

4. Operating MongoDB through the shell

5. Operating MongoDB from Python

Database advanced features


1. Storage engine

A storage engine is the implementation of how data is stored, how it is indexed, and how updates and queries are performed.

MySQL supports multiple storage engines to meet the needs of database applications in different domains. Users can choose different storage engines to improve application efficiency and gain flexible storage.


View current storage engine

show variables like '%storage_engine';
show engines;


MySQL common storage engine

  1. InnoDB

    The preferred engine for transactional databases supports transaction security tables (ACID), row locking and foreign keys. InnoDB is the default MySQL engine.

    The main features of InnoDB are:

    1. InnoDB provides MySQL with a transaction security storage engine with the capabilities of commit, rollback and crash recovery.
    2. InnoDB is designed for maximum performance in handling huge amounts of data. Its CPU efficiency is higher than other disk based relational database engines.
    3. InnoDB storage engine has its own buffer pool, which can cache data and indexes in memory.
    4. InnoDB supports foreign key integrity constraints.
    5. InnoDB is used in many large database sites that need high performance
    6. InnoDB supports row level locks
  2. MyISAM

    MyISAM is based on the older ISAM storage engine and extends it. It is one of the most commonly used storage engines in Web, data warehousing, and other application environments. MyISAM offers high insert and query speed, but it does not support transactions.

    The main features of MyISAM are:

    1. Large file support is better
    2. When deleting, updating and inserting are mixed, fewer fragments are generated.
    3. The maximum number of indexes per MyISAM table is 64, which can be changed by recompilation. The maximum number of columns per index is 16
    4. The maximum key length is 1000 bytes.
    5. BLOB and TEXT columns can be indexed
    6. NULL is allowed in indexed columns. This value takes 0 to 1 bytes per key
    7. All numeric key values are stored with the high byte first, allowing better index compression
    8. auto_increment columns in MyISAM tables are handled faster than auto_increment columns in InnoDB
    9. Data files and index files can be placed in different directories
    10. Each character column can have a different character set
    11. Tables with VARCHAR can have fixed or dynamic record lengths
    12. VARCHAR and CHAR columns can be up to 64KB
    13. Only table locks are supported
  3. MEMORY

    The MEMORY storage engine stores the data in the table in MEMORY to provide fast access for querying and referencing other table data.


Choosing a storage engine

In general, choose InnoDB when high insert and concurrency performance is required, or when foreign keys or transaction support are needed.

For scenarios with few inserts and many queries, MyISAM is usually preferred.


Specifying an engine

The engine is usually specified when creating a table:

create table abc (
    name char(10)
) engine=MyISAM charset=utf8;
​
create table xyz (
    name char(10)
) engine=InnoDB charset=utf8;


Differences between InnoDB and MyISAM in terms of files

  1. InnoDB stores a table as two files

    • Demo.frm -> stores the table structure and indexes
    • Demo.ibd -> stores the data; if a single .ibd file is not enough, ibd1, ibd2, etc. are created automatically
    • InnoDB files are created inside the corresponding database directory and cannot be moved around freely
  2. MyISAM stores a table as three files

    • Demo.frm -> stores the table structure
    • Demo.myd -> stores the data
    • Demo.myi -> stores the table's indexes
    • MyISAM files can be moved freely


2. Relationships and foreign keys


Relationships

  • One-to-one

    • A record in table A corresponds to exactly one record in table B
    • Example: a student table and a grade table
  • One-to-many / many-to-one

    • One record in table A corresponds to multiple records in table B
    • Example: the user table and article table in a blog
  • Many-to-many

    • A record in table A corresponds to multiple records in table B, and likewise a record in table B corresponds to multiple records in table A
    • Example: the favorites table in a blog
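The many-to-many case above is typically resolved through a junction table. As a minimal, self-contained sketch (using Python's built-in sqlite3 as a stand-in for MySQL; the table and column names mirror the schema used later in this section), a query can traverse the junction table with two joins:

```python
import sqlite3

# In-memory SQLite database as a self-contained stand-in for MySQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE user    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    -- junction table: turns many-to-many into two one-to-many links
    CREATE TABLE "order" (id INTEGER PRIMARY KEY, uid INTEGER, pid INTEGER);
""")
cur.executemany("INSERT INTO user VALUES (?, ?)", [(1, "alice"), (2, "bob")])
cur.executemany("INSERT INTO product VALUES (?, ?)", [(1, "pen"), (2, "book")])
# alice ordered a pen and a book; bob ordered a pen
cur.executemany('INSERT INTO "order" (uid, pid) VALUES (?, ?)',
                [(1, 1), (1, 2), (2, 1)])

# Join through the junction table: which products did alice order?
cur.execute("""
    SELECT product.name FROM "order"
    JOIN user    ON user.id = "order".uid
    JOIN product ON product.id = "order".pid
    WHERE user.name = ?
    ORDER BY product.id
""", ("alice",))
products = [row[0] for row in cur.fetchall()]
print(products)  # -> ['pen', 'book']
```

The same join pattern works unchanged in MySQL; only the connection setup differs.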


Foreign key

A foreign key is a constraint. It only ensures data consistency and brings no performance benefit to the system.

When a foreign key is created, an index is built on the foreign key column. Because the constraint is checked on every insert and update, and the operation is rejected if the constraint is not satisfied, foreign keys inevitably cause a problem: with a large amount of data, every constraint check degrades performance.

For this reason, if the system requires high performance, you can consider not using foreign keys in the production environment.

  1. Construction data

    -- User table
    create table `user` (
        `id` int unsigned primary key auto_increment,
        `name` char(32) not null
    ) charset=utf8;
    ​
    -- Commodity list
    create table `product` (
        `id` int unsigned primary key auto_increment,
        `name` char(32) not null unique,
        `price` float
    ) charset=utf8;
    ​
    -- User information table: one-on-one
    create table `userinfo` (
        `id` int unsigned primary key auto_increment,
        `phone` int unsigned unique,
        `age` int unsigned,
        `location` varchar(128)
    ) charset=utf8;
    ​
    -- User group table: One to many
    create table `group` (
        `id` int unsigned primary key auto_increment,
        `name` char(32) not null unique
    ) charset=utf8;
    ​
    -- Order form: Many to many
    create table `order` (
        `id` int unsigned primary key auto_increment,
        `uid` int unsigned,
        `pid` int unsigned
    ) charset=utf8;

  2. Add foreign key

     
    -- Create a foreign key linking userinfo to user
    alter table userinfo add constraint fk_user_id foreign key(id) references user(id);
    ​
    -- Establish foreign key constraints for users and groups
    alter table `user` add `gid` int unsigned;
    alter table `user` add constraint `fk_group_id` foreign key(`gid`) references `group`(`id`);
    ​
    -- Establish foreign key constraints among users, products and orders
    -- (constraint names must be unique within the database, so they differ from the one above)
    alter table `order` add constraint `fk_order_uid` foreign key(`uid`) references `user`(`id`);
    alter table `order` add constraint `fk_order_pid` foreign key(`pid`) references `product`(`id`);

  3. After inserting some data, try deleting from the parent table and from the child table respectively, and observe how the constraints behave.

  4. Delete foreign keys.

    alter table `table_name` drop foreign key `foreign_key_name`;
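The constraint check described above can be seen in a few lines. The sketch below uses Python's built-in sqlite3 so it runs self-contained (SQLite needs the pragma shown to enforce foreign keys, whereas InnoDB checks them by default); the user/order schema is a simplified version of the one above:

```python
import sqlite3

# Minimal sketch of foreign-key enforcement, using SQLite as a stand-in.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite disables FK checks by default
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE "order" (
        id  INTEGER PRIMARY KEY,
        uid INTEGER REFERENCES user(id)
    );
""")
conn.execute("INSERT INTO user (id, name) VALUES (1, 'alice')")
conn.execute('INSERT INTO "order" (uid) VALUES (1)')       # ok: user 1 exists
try:
    conn.execute('INSERT INTO "order" (uid) VALUES (99)')  # no such user
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the constraint blocked the orphan row
print(rejected)  # -> True
```

This is exactly the check that costs performance at scale: every insert into the child table triggers a lookup against the parent table.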

Database transactions and more


1. Transactions


1. Introduction

Transactions are mainly used for data operations that are large in volume, high in complexity, and strongly interrelated.

For example, in a personnel management system, deleting a person requires deleting not only the person's basic record but also related information such as mailboxes and articles. Together, these database statements form a transaction.

In MySQL, only the InnoDB storage engine supports transactions.

Transaction processing maintains the integrity of the database by guaranteeing that a batch of SQL statements either all execute or none execute. It mainly applies to insert, update and delete statements.


2. Four properties of transactions

To ensure that writing or updating data is correct and reliable, a transaction must have four properties (ACID):

  1. Atomicity:

    • All operations in a transaction are either completed or not completed, and will not end in an intermediate phase.
    • If an error occurs during the execution of a transaction, it will be rolled back to the state before the start of the transaction, as if the transaction had never been executed.
  2. Consistency:

    Before and after a transaction, the integrity of the database is not violated. This means the written data must fully comply with all predefined rules, including the accuracy and consistency of the data, and the database can subsequently complete its scheduled work on its own.

  3. Isolation:

    Isolation describes the degree to which concurrent transactions affect each other, such as whether one transaction can read data modified by another uncommitted transaction. Problems that can arise under concurrent transactions:

    • Dirty read: transaction A modifies data but has not committed, and transaction B reads A's uncommitted result. If A then fails to commit, B has read dirty data.
    • Non-repeatable read: within the same transaction, reading the same data twice gives inconsistent results. For example, what transaction B reads before transaction A commits may differ from what it reads afterwards. Non-repeatable reads are caused by transactions concurrently modifying records. The simplest way to avoid this is to lock the records being modified, but that intensifies lock contention and hurts performance; another approach, MVCC, avoids non-repeatable reads without locks.
    • Phantom read: within the same transaction, the same query returns inconsistent results across executions. Transaction A inserts a new record; transaction B runs the same query before and after A commits and finds one more row the second time. Phantom reads are caused by concurrent transactions inserting records. They cannot be solved by locking existing records the way non-repeatable reads can, because the newly inserted records cannot be locked in advance; serializing transactions avoids phantom reads.

    The isolation levels of transactions from low to high are:

    1. Read uncommitted

      • All transactions can see the execution results of other uncommitted transactions
      • This isolation level is rarely used in practical applications because its performance is not much better than other levels
      • The problem caused by this level is dirty read: uncommitted data is read
    2. Read committed

      • This is the default isolation level for most database systems (but not MySQL)

      • It satisfies the simple definition of isolation: a transaction can only see the changes made by the committed transaction

      • The problem with this isolation level is the non-repeatable read:

        A non-repeatable read means we may see different results when executing exactly the same select statement within the same transaction.

        This may be caused by:

        • Another transaction committing in between, changing the data;
        • When the database is operated by multiple instances, other instances of the same transaction making new commits while this instance is processing
    3. Repeatable read

      • This is MySQL's default transaction isolation level
      • It ensures that multiple instances of the same transaction see the same rows when reading data concurrently
      • Possible problem at this level: phantom read — when a transaction reads rows in a range and another transaction inserts a new row into that range, re-reading the range reveals a new "phantom" row
      • InnoDB mitigates phantom reads through its multi-version concurrency control (MVCC) mechanism
      • InnoDB also addresses phantom reads through gap locks
    4. Serializable

      • This is the highest isolation level
      • It avoids phantom reads by forcing transactions to be ordered so that they cannot conflict with each other; in short, it places a shared lock on every row read
      • At this level, a large number of timeouts and lock contention may occur
  4. Durability:

    After the transaction is completed, the data modification is permanent and will not be lost even if the system fails.


3. Grammar and usage

  • Start a transaction:   BEGIN or START TRANSACTION

  • Commit a transaction:   COMMIT - makes all changes permanent

  • Roll back:   ROLLBACK - undoes all uncommitted changes in progress

  • To create a savepoint:   SAVEPOINT identifier

  • Delete savepoint:   RELEASE SAVEPOINT identifier

  • Rollback transaction to savepoint:   ROLLBACK TO identifier

  • Query isolation level of transaction:   show variables like '%isolation%';

  • Set the isolation level of transactions:   SET [SESSION | GLOBAL] TRANSACTION ISOLATION LEVEL {READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE}

    The isolation levels provided by InnoDB are:

    • READ UNCOMMITTED
    • READ COMMITTED
    • REPEATABLE READ
    • SERIALIZABLE
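The SAVEPOINT statements listed above can be exercised from Python. A minimal sketch, using the built-in sqlite3 module (whose BEGIN / SAVEPOINT / ROLLBACK TO syntax closely matches MySQL's) so it runs self-contained:

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode,
# letting us issue BEGIN/COMMIT/SAVEPOINT by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE abc (id INTEGER PRIMARY KEY, name TEXT)")

cur.execute("BEGIN")
cur.execute("INSERT INTO abc (name) VALUES ('aa')")
cur.execute("SAVEPOINT sp1")                  # create a savepoint
cur.execute("INSERT INTO abc (name) VALUES ('bb')")
cur.execute("ROLLBACK TO sp1")                # undo everything after the savepoint
cur.execute("COMMIT")                         # 'aa' survives, 'bb' does not

names = [row[0] for row in conn.execute("SELECT name FROM abc")]
print(names)  # -> ['aa']
```

A savepoint thus lets you roll back part of a transaction while still committing the rest.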


4. Examples


create table `abc` (
    id int unsigned primary key auto_increment,
    name varchar(32) unique,
    age int unsigned
) charset=utf8;
​
begin;
insert into abc (name, age) values ('aa', 11);
insert into abc (name, age) values ('bb', 22);
-- View the data in the transaction
-- Meanwhile, open another window, connect to MySQL, and check whether the data matches
select * from abc;
commit;
​
begin;
insert into abc (name, age) values ('cc', 33);
insert into abc (name, age) values ('dd', 44);
update abc set age=77 where name='aa';
-- View the data in the transaction
select * from abc;
rollback;
​
select * from abc;  -- Check the data after the transaction
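The same commit-versus-rollback experiment can be driven from Python. This is a sketch using the built-in sqlite3 module against an in-memory database (a stand-in for MySQL, so it is runnable as-is):

```python
import sqlite3

# Autocommit mode so BEGIN/COMMIT/ROLLBACK are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE abc (id INTEGER PRIMARY KEY, name TEXT UNIQUE, age INTEGER)")

cur.execute("BEGIN")
cur.execute("INSERT INTO abc (name, age) VALUES ('aa', 11)")
cur.execute("INSERT INTO abc (name, age) VALUES ('bb', 22)")
cur.execute("COMMIT")      # both rows become permanent

cur.execute("BEGIN")
cur.execute("INSERT INTO abc (name, age) VALUES ('cc', 33)")
cur.execute("UPDATE abc SET age = 77 WHERE name = 'aa'")
cur.execute("ROLLBACK")    # discard everything since BEGIN

rows = list(conn.execute("SELECT name, age FROM abc ORDER BY id"))
print(rows)  # -> [('aa', 11), ('bb', 22)]
```

After the rollback, neither the 'cc' insert nor the update to 'aa' is visible, mirroring the SQL session above.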

 


2. Stored procedures

Stored Procedure is a database object that stores complex programs in a database for external programs to call.

Stored procedure is a set of SQL statements to complete specific functions. It is compiled, created and saved in the database. Users can call and execute it by specifying the name of the stored procedure and giving parameters (when necessary).

The idea of stored procedure is very simple, that is, code encapsulation and reuse at the level of database SQL language.

  1. Advantages

    • Stored procedures can be encapsulated and hide complex business logic.
    • Stored procedures can return values and accept parameters.
    • A stored procedure is a subroutine; unlike a view, table, or user-defined function, it cannot be queried with a SELECT statement.
    • Stored procedures can be used for data validation, enforcing business logic, etc.
  2. Disadvantages

    • Stored procedures are often customized to a specific database because the supported programming languages are different. When switching to the database system of other manufacturers, the original stored procedure needs to be rewritten.
    • The performance tuning and writing of stored procedures are limited by various database systems.


Syntax

  1. The statement terminator can be customized:

    A stored procedure contains many SQL statements, and each must end with a semicolon (;) to keep the syntax valid. By default, however, a semicolon tells the client to send the code to the server for execution, so the terminator must be changed first

    DELIMITER $$
    -- perhaps
    DELIMITER //

  2. Declare stored procedure:

     
    CREATE PROCEDURE demo_in_parameter(IN p_in int)

  3. Stored procedure start and end symbols:

     
    BEGIN .... END

  4. Variable assignment:

    SET @p_in=1

  5. Variable definition:

    DECLARE l_int int unsigned default 4000000;

  6. Create a stored procedure:

    create procedure procedure_name(parameters)

  7. Create a stored function:

    create function function_name(parameters)


Usage

  1. Simple usage

    -- Definition
    -- If the stored procedure contains only one SQL statement, begin...end can be omitted
    create procedure get_info()
    select * from student;
    ​
    -- call
    call get_info();

  2. A slightly more complex example (note: this can only be executed in the standard mysql client; mycli cannot recognize it)

    -- Before the definition, change the delimiter to //
    delimiter //
    create procedure foo(in uid int)
    begin
    select * from student where `id`=uid;
    update student set `city`='Beijing' where `id`=uid;
    end//
    -- After the definition, change the delimiter back to a semicolon
    delimiter ;
    ​
    call foo(3);

  3. Viewing stored procedures

    show procedure status like "%foo%";
    show create procedure foo;

  4. Delete stored procedure

    drop procedure foo;


3. Operating MySQL from Python

  1. Installation:

     pip install pymysql

  2. Usage

    import pymysql
    ​
    db = pymysql.connect(host='localhost',
                         user='user',
                         password='passwd',
                         db='db',
                         charset='utf8')
    ​
    try:
        with db.cursor() as cursor:
            # insert
            sql = "INSERT INTO `users` (`email`, `password`) VALUES (%s, %s)"
            cursor.execute(sql, ('webmaster@python.org', 'very-secret'))
        # Changes must be committed manually
        db.commit()
    ​
        with db.cursor() as cursor:
            # Read record
            sql = "SELECT `id`, `password` FROM `users` WHERE `email`=%s"
            cursor.execute(sql, ('webmaster@python.org',))
            result = cursor.fetchone()
            print(result)
    finally:
        db.close()


4. SQL injection

What is SQL injection?

SQL injection (SQLi) is an injection attack that executes malicious SQL statements. By inserting arbitrary SQL code into a database query, an attacker can take control of the database server behind a web application. Attackers can use SQL injection vulnerabilities to bypass application security measures, circumvent the authentication and authorization of a web page or application, retrieve the contents of the entire SQL database, and add, modify, or delete records in it.

Think: is there a problem with the following code?

import pymysql
db = pymysql.connect(host='localhost',user='root',password='abcd1234',db='test',charset='utf8')
​
name=input('Enter username: ')
password=input('Enter password: ')
try:
    with db.cursor() as cursor:
        sql = 'select * from user where name="%s" and password="%s"' %(name,password)
        print(sql)
        cursor.execute(sql)
        print(cursor.fetchone())
    db.commit()
finally:
    db.close()
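Yes: the code above splices user input directly into the SQL string, so a crafted username can rewrite the query. A self-contained demonstration, using the built-in sqlite3 module as a stand-in for MySQL (note the placeholder style differs: sqlite3 uses ?, while pymysql uses %s):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE user (name TEXT, password TEXT)")
cur.execute("INSERT INTO user VALUES ('admin', 'secret')")

# Malicious input: closes the quoted string and makes the WHERE clause always true.
name = "admin' or '1'='1"
password = 'wrong'

# Vulnerable: user input is spliced directly into the SQL text.
sql = "select * from user where name='%s' and password='%s'" % (name, password)
cur.execute(sql)
leaked = cur.fetchone()   # the admin row comes back despite the wrong password

# Safe: placeholders send the values separately, so they are never parsed as SQL.
cur.execute("select * from user where name=? and password=?", (name, password))
blocked = cur.fetchone()  # None: no matching user/password pair
print(leaked, blocked)
```

The fix is the same in pymysql: always pass user input as the second argument to cursor.execute, never format it into the SQL string yourself.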


5. Data backup and recovery

  1. Backup

    mysqldump -h localhost -u root -p dbname > dbname.sql

  2. Restore

    mysql -h localhost -u root -p123456 dbname < ./dbname.sql

Redis and MongoDB


1. NoSQL overview

Nowadays, most computer systems (servers, PCs, mobile devices, etc.) produce enormous amounts of data. As early as 2012, the world was already generating 2.5 EB (exabytes) of data every day. A large part of this data is stored and managed by relational databases. Practice has proven that relational databases are the most important way to achieve data persistence, and they remain the default choice for most applications when selecting a persistence scheme.

NoSQL is a revolutionary database movement. Although its history can be traced back to 1998, NoSQL only became truly popular and widely used with the arrival of the big data era, which the industry generally regards as better served by NoSQL storage schemes; this pushed the development of NoSQL to unprecedented heights. In 2012, the New York Times wrote in a column that the era of big data had arrived: in business, economics, and other fields, decisions would no longer be based on experience and intuition but on data and analysis. In fact, in astronomy, meteorology, genomics, biology, sociology, Internet search, finance, medicine, social networks, e-commerce, and many other fields, data had become so dense and voluminous that its analysis and processing faced unprecedented limits and obstacles. All of this pushed research on big data processing to a new level and brought the various NoSQL technical solutions into public view.

NoSQL databases can be roughly divided into the following categories by storage type:

  • Column-family databases (e.g. HBase, Cassandra, Hypertable): as the name suggests, data is stored by column. The biggest advantages are convenient storage of structured and semi-structured data and good data compression; queries over one column or a few columns have great I/O advantages. Suitable for batch data processing and real-time queries.
  • Document databases (e.g. MongoDB, CouchDB, ElasticSearch): data is generally stored in a JSON-like format, and the stored content is document-oriented. This makes it possible to index certain fields and implement some features of relational databases, but there is no support for referential integrity or distributed transactions.
  • Key-value databases (e.g. DynamoDB, Redis, LevelDB): values can be looked up quickly by key. There are two implementation schemes: memory-based and disk-based.
  • Graph databases (e.g. Neo4J, FlockDB, JanusGraph): databases that use a graph structure for semantic queries, representing and storing data as nodes, edges, and attributes. By design, graph databases can simply and quickly traverse the complex hierarchies that are hard to model in relational systems.
  • Object databases (e.g. db4o, Versant): the database is operated through syntax similar to an object-oriented language, and data is accessed as objects.


2. Getting started with Redis

Redis is a NoSQL database based on key-value pairs. It supports multiple data types (strings, hashes, lists, sets, sorted sets, bitmaps, etc.) and can satisfy the needs of many application scenarios. Redis keeps data in memory, so its read/write performance is outstanding. At the same time, Redis provides a persistence mechanism that saves the in-memory data to disk, so data is not lost in case of an accident. In addition, Redis supports key expiration, geo operations, publish/subscribe, transactions, pipelines, Lua script extensions, and more. In short, Redis is very powerful in both features and performance. If a project needs services such as caching or a message queue, they can be handed directly to Redis. Many well-known enterprises and commercial projects use Redis, including Twitter, GitHub, StackOverflow, Sina Weibo, Baidu, Youku Tudou, Meituan, Xiaomi, and Vipshop.


1. Introduction to Redis

In 2008, a programmer named Salvatore Sanfilippo wrote a custom database for his LLOOGG project (because no matter how he optimized MySQL, the system's performance could not be improved further). The result of this work was the initial version of Redis. Later he published the Redis code on GitHub, the world's largest code-hosting platform. Since then, Redis has attracted the praise and attention of a large number of developers, and hundreds of people have since participated in its development and maintenance, making Redis ever more powerful.

Redis stands for REmote DIctionary Server. It is a high-performance key-value storage system written in ANSI C. Compared with other key-value storage systems, Redis has the following characteristics (and advantages):

  • Redis has extremely high read-write performance and rich features (publish / subscribe, transaction, notification, etc.).
  • Redis supports data persistence (RDB and AOF). It can save the data in memory on disk and can be loaded again for use when restarting.
  • Redis supports a variety of data types, including string, hash, list, set, zset, bitmap, hyperloglog, etc.
  • Redis supports master-slave replication (read-write separation) and sentinel mode (monitor whether the master is down and automatically adjust the configuration).
  • Redis supports distributed clusters and can easily improve the overall performance of the system through horizontal expansion.
  • Redis communicates based on the reliable transmission service provided by TCP, and many programming languages provide redis client support.


2. Redis application scenario

  1. Cache - put the infrequently changed but frequently accessed hot data into the Redis database, which can greatly reduce the pressure of the relational database and improve the response performance of the system.
  2. Leaderboard - many websites have leaderboard function. It is very convenient to construct various leaderboard systems by using the list and ordered collection in Redis.
  3. Flash sales / voting / likes - Redis provides support for counting operations. Common flash-sale and like features on websites can be implemented with Redis counters via +1 or -1 operations, avoiding update operations on the relational database.
  4. Distributed lock - Redis can realize the function of distributed lock across multiple servers (similar to thread lock, but can be shared by multiple threads or processes on multiple machines) to realize a blocking operation.
  5. Message queue - like cache, message queue is an indispensable basic service for a large website. It can realize business decoupling and non real-time business peak shaving, which we will show you in later projects.


3. Redis installation and configuration

You can download the Redis source code from the official website, unpack the archive, and then build and install it with make:

wget http://101.44.1.120/files/318700000890F623/download.redis.io/releases/redis-5.0.8.tar.gz
tar -zxvf redis-5.0.8.tar.gz
cd redis-5.0.8
sudo make && sudo make install


4. Redis configuration

There is a configuration file named redis.conf in the Redis source directory. Let's examine it first:   vim redis.conf

  1. Bind the Redis service to the specified IP address and port.

    bind 127.0.0.1
    port 6379

  2. Set background running (running as a daemon)

    daemonize yes

  3. Set the log level. The options are debug, verbose, notice, and warning.

    loglevel warning

  4. Configure the number of databases; the default is 16

    databases 16

  5. Configure data write rules

    save 900 1     # if at least 1 key changed within 900 seconds (15 minutes), write to disk once
    save 300 10    # if at least 10 keys changed within 300 seconds (5 minutes), write to disk once
    save 60 10000  # if at least 10000 keys changed within 60 seconds (1 minute), write to disk once

  6. Configure the persistence mechanism of Redis - RDB.

     
    rdbcompression yes   # Compress RDB files
    rdbchecksum yes      # Verify RDB files
    dbfilename dump.rdb  # The file name of the RDB database file
    dir /var/local/redis               # Directory where RDB files are saved

  7. Configure the persistence mechanism of Redis - AOF.

    appendonly no
    appendfilename "appendonly.aof"

  8. Configure the master-slave replication of Redis. The read-write separation can be realized through the master-slave replication.

    # Master-Replica replication. Use replicaof to make a Redis instance a copy of
    # another Redis server. A few things to understand ASAP about Redis replication.
    #
    #   +------------------+      +---------------+
    #   |      Master      | ---> |    Replica    |
    #   | (receive writes) |      |  (exact copy) |
    #   +------------------+      +---------------+
    #
    # 1) Redis replication is asynchronous, but you can configure a master to
    #    stop accepting writes if it appears to be not connected with at least
    #    a given number of replicas.
    # 2) Redis replicas are able to perform a partial resynchronization with the
    #    master if the replication link is lost for a relatively small amount of
    #    time. You may want to configure the replication backlog size (see the next
    #    sections of this file) with a sensible value depending on your needs.
    # 3) Replication is automatic and does not need user intervention. After a
    #    network partition replicas automatically try to reconnect to masters
    #    and resynchronize with them.
    #
    replicaof <masterip> <masterport>

  9. Configure slow queries.

    slowlog-log-slower-than 10000  # operations taking more than 10000 microseconds are logged as slow queries
    slowlog-max-len 128            # keep at most 128 slow-query log entries


5. Redis server and client

Next, start the Redis server. The following method will start the Redis service with the specified configuration file.

redis-server redis.conf

Next, use the Redis client to connect to the server.

redis-cli -h localhost -p 6379

Redis has a rich set of data types and many commands to operate on them. For details, see the Redis command reference; besides the command reference, that site also hosts Redis's detailed documentation, covering notifications, transactions, master-slave replication, persistence, sentinels, clustering, and so on.

127.0.0.1:6379> set username admin
OK
127.0.0.1:6379> get username
"admin"
127.0.0.1:6379> set password "123456" ex 300
OK
127.0.0.1:6379> get password
"123456"
127.0.0.1:6379> ttl username
(integer) -1
127.0.0.1:6379> ttl password
(integer) 286
127.0.0.1:6379> hset stu1 name hao
(integer) 1
127.0.0.1:6379> hset stu1 age 38
(integer) 1
127.0.0.1:6379> hset stu1 gender male
(integer) 1
127.0.0.1:6379> hgetall stu1
1) "name"
2) "hao"
3) "age"
4) "38"
5) "gender"
6) "male"
127.0.0.1:6379> hvals stu1
1) "hao"
2) "38"
3) "male"
127.0.0.1:6379> hmset stu2 name wang age 18 gender female tel 13566778899
OK
127.0.0.1:6379> hgetall stu2
1) "name"
2) "wang"
3) "age"
4) "18"
5) "gender"
6) "female"
7) "tel"
8) "13566778899"
127.0.0.1:6379> lpush nums 1 2 3 4 5
(integer) 5
127.0.0.1:6379> lrange nums 0 -1
1) "5"
2) "4"
3) "3"
4) "2"
5) "1"
127.0.0.1:6379> lpop nums
"5"
127.0.0.1:6379> lpop nums
"4"
127.0.0.1:6379> rpop nums
"1"
127.0.0.1:6379> rpop nums
"2"
127.0.0.1:6379> sadd fruits apple banana orange apple grape grape
(integer) 4
127.0.0.1:6379> scard fruits
(integer) 4
127.0.0.1:6379> smembers fruits
1) "grape"
2) "orange"
3) "banana"
4) "apple"
127.0.0.1:6379> sismember fruits apple
(integer) 1
127.0.0.1:6379> sismember fruits durian
(integer) 0
127.0.0.1:6379> sadd nums1 1 2 3 4 5
(integer) 5
127.0.0.1:6379> sadd nums2 2 4 6 8
(integer) 4
127.0.0.1:6379> sinter nums1 nums2
1) "2"
2) "4"
127.0.0.1:6379> sunion nums1 nums2
1) "1"
2) "2"
3) "3"
4) "4"
5) "5"
6) "6"
7) "8"
127.0.0.1:6379> sdiff nums1 nums2
1) "1"
2) "3"
3) "5"
127.0.0.1:6379> zadd topsinger 5234 zhangxy 1978 chenyx 2235 zhoujl 3520 xuezq
(integer) 4
127.0.0.1:6379> zrange topsinger 0 -1 withscores
1) "chenyx"
2) "1978"
3) "zhoujl"
4) "2235"
5) "xuezq"
6) "3520"
7) "zhangxy"
8) "5234"
127.0.0.1:6379> zrevrange topsinger 0 -1
1) "zhangxy"
2) "xuezq"
3) "zhoujl"
4) "chenyx"
127.0.0.1:6379> geoadd pois 116.39738549206541 39.90862689286386 tiananmen 116.27172936413572 39.99135172904494 yiheyuan 117.27766503308104 40.65332064313784 gubeishuizhen
(integer) 3
127.0.0.1:6379> geodist pois tiananmen gubeishuizhen km
"111.5333"
127.0.0.1:6379> geodist pois tiananmen yiheyuan km
"14.1230"
127.0.0.1:6379> georadius pois 116.86499108288572 40.40149669363615 50 km withdist
1) 1) "gubeishuizhen"
   2) "44.7408"
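The geodist results above can be sanity-checked with a plain haversine calculation. This is a hedged sketch, not Redis's actual implementation (Redis derives coordinates from geohash-encoded values, so its figures can differ slightly in the last decimals); the earth radius below is an assumption chosen to be close to the one Redis uses internally.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6372.797  # approximately the radius used in Redis's geo code


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Same coordinates as the geoadd call above
tiananmen = (116.39738549206541, 39.90862689286386)
yiheyuan = (116.27172936413572, 39.99135172904494)
print(round(haversine(*tiananmen, *yiheyuan), 4))  # close to the "14.1230" reported by geodist
```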


6. Redis persistence

While Redis is running, all data lives in memory; persistence writes that data to the hard disk so that, on startup, Redis can read the files back and reload everything into memory (which may occupy a lot of memory).

Redis offers two forms of persistence: RDB and AOF.


6.1 RDB

RDB is the default persistence method: Redis snapshots the data in memory and saves it in binary form to the dump.rdb file. Snapshots are taken at the time points specified in the configuration file.

save 900 1      # snapshot if at least 1 key changed within 900 seconds
save 300 10     # snapshot if at least 10 keys changed within 300 seconds
save 60 10000   # snapshot if at least 10000 keys changed within 60 seconds

Advantages: fast, since it directly mirrors the data in memory; the file is small.

Disadvantages: data may be lost; any changes made between two save points disappear if Redis crashes.


6.2 AOF

AOF (Append Only File) persistence records every write command in appendonly.aof. The AOF feature must be enabled by modifying the configuration file.

appendfsync always    # fsync every time a command is appended to the AOF file; very slow but safest
appendfsync everysec  # fsync once per second (commonly used, often combined with RDB); at most 1 second of data is lost on failure
appendfsync no        # never fsync explicitly, leaving flushing to the operating system; fastest but least safe

Advantages: suitable for saving incremental data, with little to no data loss.

Disadvantages: the file is large and recovery takes longer.
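Putting it together, a minimal redis.conf fragment that turns on AOF might look like this (the filename and fsync policy shown are the common defaults, given here purely as an illustration):

```
appendonly yes                   # enable AOF persistence
appendfilename "appendonly.aof"  # file the write commands are appended to
appendfsync everysec             # fsync once per second: the usual speed/safety trade-off
```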


7. Use Redis in Python programs

You can install the redis module with pip. The core of the module is a class named Redis; an object of this class represents a Redis client, through which you can send commands to the Redis server and obtain the results. The commands we used in the Redis client above correspond almost one-to-one to methods on the Redis object, so once you know the Redis commands you can work with Redis from Python.

>>> import redis
>>> client = redis.Redis(host='1.2.3.4', port=6379, password='1qaz2wsx')
>>> client.set('username', 'admin')
True
>>> client.hset('student', 'name', 'hao')
1
>>> client.hset('student', 'age', 38)
1
>>> client.keys('*')
[b'username', b'student']
>>> client.get('username')
b'admin'
>>> client.hgetall('student')
{b'name': b'hao', b'age': b'38'}


3, MongoDB overview


1. Introduction to mongodb

MongoDB is a document-oriented database management system released in 2009. It is written in C++ and provides a scalable, high-performance data storage solution for web applications. Although MongoDB is classified as a NoSQL product, it sits somewhere between relational and non-relational databases: among non-relational databases it has the richest feature set and is the most similar to a relational database.

MongoDB stores data as documents. A document is a series of key-value pairs, similar to a JSON object, but MongoDB stores it in a binary encoding (which allows keys and values to be located faster), so its document storage format is called BSON. For the differences between JSON and BSON, see the article <JSON and BSON> on the official MongoDB website.
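To make the idea concrete, here is a minimal Python sketch: a student document expressed as a Python dict and serialized to JSON text. BSON would encode the same structure in binary and with richer types (ObjectId, dates, etc.); the field names below are illustrative only.

```python
import json

# A document is a set of key-value pairs, much like a JSON object
student = {"stuid": 1001, "name": "Zhang San", "age": 38}

text = json.dumps(student)   # JSON: human-readable text
print(text)
restored = json.loads(text)  # round-trips back to the same mapping
print(restored == student)   # True
```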

At present, MongoDB supports Windows, macOS, Linux, Solaris, and other platforms, and provides drivers for a variety of development languages; Python is, of course, one of them.


2. MongoDB installation and configuration

Download MongoDB from the official download link; for Linux and macOS, compressed archives are provided. On systems with yum, you can also install the MongoDB server and client directly.

sudo yum install mongodb-server  # Install MongoDB server
sudo yum install mongodb   # Install MongoDB client
sudo mongod -f /etc/mongod.conf # Load the configuration item and start the mongodb server

Note: MongoDB's default path for saving data is the /data/db directory, which should be created in advance. In addition, when starting the MongoDB server with mongod, the --bind_ip parameter binds the service to a specified IP address, and the --port parameter specifies the port; the default port is 27017.


3. Basic concept of mongodb

We illustrate some concepts in MongoDB by comparing with relational database.

| SQL         | MongoDB     | Interpretation (SQL / MongoDB)          |
|-------------|-------------|-----------------------------------------|
| database    | database    | database / database                     |
| table       | collection  | 2D table / collection                   |
| row         | document    | record (row) / document                 |
| column      | field       | field (column) / field                  |
| index       | index       | index / index                           |
| table joins | ---         | table join / nested document            |
| primary key | primary key | primary key / primary key (_id field)   |


4. Operate MongoDB through Shell

After starting the server, you can use the interactive environment to communicate with the server, as shown below.

mongo
  1. View, create, and delete databases.

    > // Show all databases
    > show dbs
    admin   0.000GB
    config  0.000GB
    local   0.000GB
    > // Create and switch to the school database
    > use school
    switched to db school
    > // Delete current database
    > db.dropDatabase()
    { "ok" : 1 }
    >

  2. Create, delete, and view collections.

    > // Create and switch to the school database
    > use school
    switched to db school
    > // Create the colleges collection
    > db.createCollection('colleges')
    { "ok" : 1 }
    > // Create students collection
    > db.createCollection('students')
    { "ok" : 1 }
    > // View all collections
    > show collections
    colleges
    students
    > // Delete the collections
    > db.colleges.drop()
    true
    >

    Note: when inserting a document into MongoDB, the collection is created automatically if it does not exist, so inserting a document is another way to create a collection.

  3. CRUD operation of the document.

    > // Inserts a document into the students collection
    > db.students.insert({stuid: 1001, name: 'Zhang San', age: 38})
    WriteResult({ "nInserted" : 1 })
    > // Inserts a document into the students collection
    > db.students.save({stuid: 1002, name: 'Da Chui Wang', tel: '13012345678', gender: 'male'})
    WriteResult({ "nInserted" : 1 })
    > // View all documents
    > db.students.find()
    { "_id" : ObjectId("5b13c72e006ad854460ee70b"), "stuid" : 1001, "name" : "Zhang San", "age" : 38 }
    { "_id" : ObjectId("5b13c790006ad854460ee70c"), "stuid" : 1002, "name" : "Da Chui Wang", "tel" : "13012345678", "gender" : "male" }
    > // Update the document with stuid 1001
    > db.students.update({stuid: 1001}, {'$set': {tel: '13566778899', gender: 'male'}})
    WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
    > // Insert or update a document with stuid 1003
    > db.students.update({stuid: 1003}, {'$set': {name: 'Bai Yuanfang', tel: '13022223333', gender: 'male'}}, {upsert: true})
    WriteResult({
            "nMatched" : 0,
            "nUpserted" : 1,
            "nModified" : 0,
            "_id" : ObjectId("5b13c92dd185894d7283efab")
    })
    > // Query all documents
    > db.students.find().pretty()
    {
            "_id" : ObjectId("5b13c72e006ad854460ee70b"),
            "stuid" : 1001,
            "name" : "Zhang San",
            "age" : 38,
            "gender" : "male",
            "tel" : "13566778899"
    }
    {
            "_id" : ObjectId("5b13c790006ad854460ee70c"),
            "stuid" : 1002,
            "name" : "Da Chui Wang",
            "tel" : "13012345678",
            "gender" : "male"
    }
    {
            "_id" : ObjectId("5b13c92dd185894d7283efab"),
            "stuid" : 1003,
            "gender" : "male",
            "name" : "Bai Yuanfang",
            "tel" : "13022223333"
    }
    > // Query documents with stuid greater than 1001
    > db.students.find({stuid: {'$gt': 1001}}).pretty()
    {
            "_id" : ObjectId("5b13c790006ad854460ee70c"),
            "stuid" : 1002,
            "name" : "Da Chui Wang",
            "tel" : "13012345678",
            "gender" : "male"
    }
    {
            "_id" : ObjectId("5b13c92dd185894d7283efab"),
            "stuid" : 1003,
            "gender" : "male",
            "name" : "Bai Yuanfang",
            "tel" : "13022223333"
    }
    > // When querying documents with stuid greater than 1001, only the name and tel fields are displayed
    > db.students.find({stuid: {'$gt': 1001}}, {_id: 0, name: 1, tel: 1}).pretty()
    { "name" : "Da Chui Wang", "tel" : "13012345678" }
    { "name" : "Bai Yuanfang", "tel" : "13022223333" }
    > // Query the document with name "Zhang San" or tel "13022223333"
    > db.students.find({'$or': [{name: 'Zhang San'}, {tel: '13022223333'}]}, {_id: 0, name: 1, tel: 1}).pretty()
    { "name" : "Zhang San", "tel" : "13566778899" }
    { "name" : "Bai Yuanfang", "tel" : "13022223333" }
    > // Query students, skipping the first document and returning only one
    > db.students.find().skip(1).limit(1).pretty()
    {
            "_id" : ObjectId("5b13c790006ad854460ee70c"),
            "stuid" : 1002,
            "name" : "Da Chui Wang",
            "tel" : "13012345678",
            "gender" : "male"
    }
    > // Sort query results (1 indicates ascending order, - 1 indicates descending order)
    > db.students.find({}, {_id: 0, stuid: 1, name: 1}).sort({stuid: -1})
    { "stuid" : 1003, "name" : "Bai Yuanfang" }
    { "stuid" : 1002, "name" : "Da Chui Wang" }
    { "stuid" : 1001, "name" : "Zhang San" }
    > // Creates an index on one or more of the specified fields
    > db.students.ensureIndex({name: 1})
    {
            "createdCollectionAutomatically" : false,
            "numIndexesBefore" : 1,
            "numIndexesAfter" : 2,
            "ok" : 1
    }
    >

With MongoDB you can easily configure data replication, achieving high availability and disaster recovery through redundant copies of the data, and accommodate rapidly growing data volumes through sharding. For more operations on MongoDB, refer to the official documentation; I also recommend the book MongoDB: The Definitive Guide by Kristina Chodorow.


5. Operate MongoDB in Python program

You can install pymongo through pip to operate MongoDB.

pip3 install pymongo
python3

>>> from pymongo import MongoClient
>>> client = MongoClient('mongodb://127.0.0.1:27017')
>>> db = client.school
>>> for student in db.students.find():
...     print('Student number:', student['stuid'])
...     print('full name:', student['name'])
...     print('Telephone:', student['tel'])
...
Student number: 1001.0
full name: Zhang San
Telephone: 13566778899
Student number: 1002.0
full name: Da Chui Wang
Telephone: 13012345678
Student number: 1003.0
full name: Bai Yuanfang
Telephone: 13022223333
>>> db.students.find().count()
3
>>> db.students.remove()
{'n': 3, 'ok': 1.0}
>>> db.students.find().count()
0
>>> coll = db.students
>>> from pymongo import ASCENDING
>>> coll.create_index([('name', ASCENDING)], unique=True)
'name_1'
>>> coll.insert_one({'stuid': int(1001), 'name': 'Zhang San', 'gender': True})
<pymongo.results.InsertOneResult object at 0x1050cc6c8>
>>> coll.insert_many([{'stuid': int(1002), 'name': 'Da Chui Wang', 'gender': False}, {'stuid': int(1003), 'name': 'Bai Yuanfang', 'gender': True}])
<pymongo.results.InsertManyResult object at 0x1050cc8c8>
>>> for student in coll.find({'gender': True}):
...     print('Student number:', student['stuid'])
...     print('full name:', student['name'])
...     print('Gender:', 'male' if student['gender'] else 'female')
...
Student number: 1001
full name: Zhang San
Gender: male
Student number: 1003
full name: Bai Yuanfang
Gender: male
>>>
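The shell queries shown earlier translate almost mechanically to pymongo: filters, projections, and sort specifications are ordinary Python dicts and lists. The sketch below only builds the argument structures (no server needed); the collection and field names follow the examples above, and the commented calls show where they would be used against a live database.

```python
# Filter: stuid greater than 1001 (shell: {stuid: {'$gt': 1001}})
query = {'stuid': {'$gt': 1001}}

# Projection: hide _id, show only the name and tel fields
projection = {'_id': 0, 'name': 1, 'tel': 1}

# OR condition: name is 'Zhang San' or tel is '13022223333'
or_query = {'$or': [{'name': 'Zhang San'}, {'tel': '13022223333'}]}

# Sort spec: pymongo takes a list of (field, direction) pairs, -1 meaning descending
sort_spec = [('stuid', -1)]

# With a running server these would be passed as, e.g.:
#   db.students.find(query, projection).sort(sort_spec)
print(query, projection, or_query, sort_spec)
```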


Tags: Database MySQL SQL

Posted on Thu, 07 Oct 2021 00:04:36 -0400 by DESIGNGRAPHY