Python -- network programming and WebServer

Network programming

  • Review: basic Linux operations
    • Ctrl+A: move to the beginning of the command line
    • Ctrl+E: move to the end of the command line
    • ifconfig: view network interface status
    • mv: move or rename a file
    • cp: copy a file
  • vim basic operations
    • In command mode:
      • Jump directly to a line: line number + G
      • Copy the current line and paste it on the next line: yyp
      • Jump to the end of the line and enter insert mode: A
      • Jump to the beginning of the line and enter insert mode: I
      • Cut the selection (in visual mode): d
      • Paste: p
      • Shift the selection left (in visual mode): <
      • Open a new line above the cursor line: O
      • Open a new line below the cursor line: o
    • vim xxx.py +4 opens the file with the cursor on the fourth line
  • Basic knowledge

socket communication

  • Steps needed to complete network communication

    import socket
    # Create a socket: socket.socket(AddressFamily, Type)
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # tcp
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)    # udp
    # Receive / send data using the socket
    # ...
    # Close the socket
    s.close()
    
  • When programming with Sublime on Linux, see the "consolidation of Linux foundation" notes for the relevant Linux operations; it is also worth summarizing the "frequent interview sites"

  • There are two common communication protocols: TCP and UDP

  • UDP Socket sending data:

    # To send data, you must know the receiver's IP address and port
    
    from socket import *
    
    # 1. Create udp socket
    udp_socket = socket(AF_INET, SOCK_DGRAM)
    
    # 2. Prepare the address of the receiver
    # '192.168.1.103' is the destination ip address
    # 8080 is the destination port
    dest_addr = ('192.168.1.103', 8080)  # A tuple: ip is a string, port is a number
    
    while True:
        # 3. Get data from the keyboard (prepare data)
        send_data = input("Please enter the data to send:")
        # Check for "exit" only after send_data has been read
        if send_data == "exit":
            break
    
        # 4. Send data
        udp_socket.sendto(send_data.encode('utf-8'), dest_addr)
    
    # 5. Close the socket
    udp_socket.close()
    
    # To run the program under Ubuntu: 	 roy@ubuntu:$ python3 xxx.py 
    
  • Before sending, ping the receiver to check that the network is reachable. Set the virtual machine's network to bridged mode; if the IP is still not in the same network segment, run sudo dhclient and wait for a new address;

  • The error TypeError: a bytes-like object is required, not 'str' means a str was passed where the socket expects bytes. Solution: prefix the string literal with b to make it a bytes object, or call encode() as shown in the code;

  • In Ubuntu, both python3 and ipython3 start an interactive mode; the latter is similar to Jupyter;
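
The bytes-versus-str error noted above can be reproduced without any network. The sketch below is only an illustration; the helper names to_wire and from_wire are hypothetical and do not come from the original notes.

```python
def to_wire(text, encoding="utf-8"):
    """Encode a str into the bytes object that send()/sendto() expect."""
    return text.encode(encoding)


def from_wire(data, encoding="utf-8"):
    """Decode bytes received from a socket back into a str."""
    return data.decode(encoding)


if __name__ == "__main__":
    payload = to_wire("hello")
    print(type(payload), payload)
    print(from_wire(payload))
    # For plain ASCII text, a bytes literal is equivalent to encode()
    print(b"hello" == payload)
```

The encoding used to decode has to match the one used to encode, which is why the receiving examples in these notes decode with gbk when the sender is a Windows tool.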

  • UDP receive data:

    # To receive data, you must bind your own port. The IP must also be one of your own
    # (leave it as '' to listen on all local addresses)
    # In short, the receiver has to fix its port and IP
    
    # Execution does not start at the first line; it starts at the __main__ check at the bottom
    
    from socket import *
    
    def main():
        # 1. Create socket
        udp_socket = socket(AF_INET, SOCK_DGRAM)
        # 2. Bind port (local information)
        local_addr = ('', 7788)
        udp_socket.bind(local_addr)
        # 3. Receive data
        # 1024 is the maximum number of bytes received this time
        # If no data has arrived, the call blocks here
        recv_data = udp_socket.recvfrom(1024)
        recv_msg = recv_data[0]   # the data (bytes)
        send_addr = recv_data[1]  # the sender's (ip, port)
        # 4. Print the received data
        print("%s:%s"%(str(send_addr), recv_msg.decode("gbk")))  # Data from Windows should be decoded with gbk
        # 5. Close the socket
        udp_socket.close()
    # Execution starts here
    if __name__ == "__main__":
        main()
    
  • Summary:

    • The sender and the receiver each need one essential parameter: the sender needs the other party's (IP, port), and the receiver needs a bound port (an application occupies at least one port to communicate);
    • Why "essential"? Both the sender and the receiver need a port of their own to exchange data (TCP and UDP are end-to-end), but because the sender does not call bind, the OS assigns it a random port;
    • Because a socket is addressed by its port, different programs (ports) on the same computer (IP) can send data to each other. A single socket, once created, can both send and receive;
    • Sockets are full duplex;
  • Chat program

    • If the program is not receiving at the moment a message arrives, the message is not lost: the OS buffers received messages until the program reads them. The drawback is that too many buffered messages occupy memory and may cause a crash. (Since the program is single-task for now, it can only behave as half duplex.)
    • Here the destination IP can be the loopback address 127.0.0.1, so the program sends to and receives from itself; you can also use Ubuntu's own IP for the same purpose;
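
The points above (the receiver binds, the sender's port is assigned by the OS, and loopback lets one machine talk to itself) can be checked with a small sketch. It assumes the loopback interface is available; the function name udp_loopback_demo is hypothetical, and port 0 is used so that the OS picks a free port instead of the fixed ports used elsewhere in these notes.

```python
import socket


def udp_loopback_demo(message="hello"):
    """Send one datagram over loopback and report what the receiver sees."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # The receiver must bind; port 0 asks the OS for any free port
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2)
        dest_addr = receiver.getsockname()

        # The sender never calls bind(); the OS assigns its port on sendto()
        sender.sendto(message.encode("utf-8"), dest_addr)
        sender_port = sender.getsockname()[1]

        data, source_addr = receiver.recvfrom(1024)
        return data.decode("utf-8"), source_addr[1], sender_port
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    text, seen_port, actual_port = udp_loopback_demo()
    print(text, seen_port == actual_port)
```

The port the receiver sees in recvfrom() is the one the OS gave the sender, which is how a reply can be addressed back even though the sender never bound anything.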

TCP communication

  • UDP is unreliable (no connection, no acknowledgement), similar to sending a letter; TCP is similar to a phone call: it requires a connection and has an acknowledgement mechanism

  • TCP has congestion control and a reliable transmission mechanism

    • Congestion control: slow start (exponential growth), congestion avoidance (linear growth), fast retransmit and fast recovery
    • Reliable transmission includes timeout retransmission and checksum verification
  • TCP follows a strict client-server model

    • Client: must connect to the server first
      from socket import *
      
      # Create socket
      tcp_client_socket = socket(AF_INET, SOCK_STREAM)	# TCP
      
      # server information
      server_ip = input("Please enter the server ip:")
      server_port = int(input("Please enter the server port:"))
      
      # Connect to the server
      tcp_client_socket.connect((server_ip, server_port))
      
      # Prompt user for data
      send_data = input("Please enter the data to send:")
      
      # The client sends first (request)
      tcp_client_socket.send(send_data.encode("gbk"))
      
      # Receive the data sent by the other party, with a maximum of 1024 bytes
      recvData = tcp_client_socket.recv(1024)
      print('The data received is:', recvData.decode('gbk'))
      
      # Close socket
      tcp_client_socket.close()
      
    • Server: bind, listen, then accept

      from socket import *
      
      # Create socket
      tcp_server_socket = socket(AF_INET, SOCK_STREAM)
      
      # Local information
      address = ('', 7788)  # tuple
      # Bind
      # A socket that must be reachable at a known port has to bind; a socket that only sends may skip it
      tcp_server_socket.bind(address)
      
      # A socket created with socket() is active by default (it can connect to others).
      # listen() makes it passive, so that it can accept connections from other sockets
      tcp_server_socket.listen(128)  # The listening socket waits for new client connections
      
      # accept() returns a new socket, client_socket, dedicated to this client
      # clientAddr is the "caller ID"
      client_socket, clientAddr = tcp_server_socket.accept()  # Blocks by default until a client calls connect()
      
      # Receive the data sent by the other party (the server receives first)
      recv_data = client_socket.recv(1024)  # recv() here, unlike recvfrom() in UDP
      print('The data received is:', recv_data.decode('gbk'))
      
      # Send some data to the client
      client_socket.send("thank you !".encode('gbk'))
      
      # Close the socket serving this client. Once closed, this client can no longer be served; it has to reconnect to get service again
      client_socket.close()
      
      tcp_server_socket.close()
      

      The difference from the client is that the server uses two sockets: one listens for connections, the other sends and receives data;
      The client sends first (request) and the server receives first and then replies (response); after that, both sides can send and receive;

  • Main differences between TCP and UDP:

    • Strict client-server model, with separate roles
    • Requires a connection, not just a destination (IP + port)
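
The client and server above normally run as two programs. As a rough sketch only, the same exchange can be run in one process over loopback by putting the server side in a thread; the names serve_once and tcp_round_trip are hypothetical, and port 0 lets the OS pick a free port.

```python
import socket
import threading


def serve_once(listen_socket, results):
    """Accept one client, read its request, and send a reply."""
    # accept() returns a NEW socket that is dedicated to this client
    client_socket, client_addr = listen_socket.accept()
    with client_socket:
        request = client_socket.recv(1024)
        results["same_socket"] = client_socket is listen_socket
        client_socket.send(b"thank you: " + request)


def tcp_round_trip(message=b"hello"):
    """Run one request/response exchange over loopback."""
    listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listen_socket.bind(("127.0.0.1", 0))
    listen_socket.listen(1)
    port = listen_socket.getsockname()[1]

    results = {}
    worker = threading.Thread(target=serve_once, args=(listen_socket, results))
    worker.start()

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.settimeout(2)
        client.connect(("127.0.0.1", port))   # the client connects first
        client.send(message)                   # then sends the request
        reply = client.recv(1024)              # then receives the response

    worker.join(timeout=2)
    listen_socket.close()
    return reply, results["same_socket"]


if __name__ == "__main__":
    print(tcp_round_trip())
```

The second value returned is False: the socket used to talk to the client is not the listening socket, which matches the "two sockets on the server" note above.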

TCP file downloader

  • client

    from socket import *
    
    def main():
        # Create socket
        tcp_client_socket = socket(AF_INET, SOCK_STREAM)
    
        # Destination information
        server_ip = input("Please enter the server ip:")
        server_port = int(input("Please enter the server port:"))
    
        # Connect to the server
        tcp_client_socket.connect((server_ip, server_port))
    
        # Enter the file name to download
        file_name = input("Please enter the file name to download:")
    
        # Send the download request (establish the connection first, send the request, then receive)
        tcp_client_socket.send(file_name.encode("utf-8"))
    
        # Receive the data sent by the other party, with a maximum of 1024 bytes (1K),
        # so a file larger than 1K is truncated by this single recv()
        recv_data = tcp_client_socket.recv(1024)
        # print('received data is: ', recv_data.decode('utf-8'))
        # Create the file only if data was received
        if recv_data:
            # with closes the file automatically, even if an exception occurs while writing
            with open("[receive]"+file_name, "wb") as f:
                f.write(recv_data)
                # When reading a file, also wrap open() in try...except, because the file may not exist
    
        # Close socket
        tcp_client_socket.close()
        
    if __name__ == "__main__":
        main()
    
  • Server:

    from socket import *
    import sys
    
    def get_file_content(file_name):
        """Get the contents of the file"""
        try:  # The file may not exist, so the read is wrapped in try...except
            with open(file_name, "rb") as f:
                content = f.read()
            return content
        except OSError:
            print("No such file to download:%s" % file_name)
            return None
    
    # Start the server from the command line: sys.argv[0] is the program's file name and sys.argv[1] is the port number
    def main():
        # sys.argv is a list of the command-line arguments passed in from outside the program,
        # e.g. for python3 test.py a b c it is ['test.py', 'a', 'b', 'c']
        if len(sys.argv) != 2:
            # Ensure that the port parameter is given
            print("Please run as follows: python3 xxx.py 7890")
            return
        else:
            # Run as: python3 xxx.py 7890
            port = int(sys.argv[1])
    
        # Create socket
        tcp_server_socket = socket(AF_INET, SOCK_STREAM)
        # Local information
        address = ('', port)
        # Bind local information (needed so that clients can reach the server)
        tcp_server_socket.bind(address)
        # Change active socket to passive socket
        tcp_server_socket.listen(128)  # Backlog of pending connections; this matters under high concurrency
    
        while True:  # The server keeps running
            # Wait for a client to connect, then serve the file to this client
            client_socket, clientAddr = tcp_server_socket.accept()  # block
            # Receive data sent by the other party
            recv_data = client_socket.recv(1024)  # Receive at most 1024 bytes; block
            file_name = recv_data.decode("utf-8")
            print("The file name requested by the other party is:%s" % file_name)
            file_content = get_file_content(file_name)
            # Send file data to client
            # The file was read in rb mode, so it is already bytes and needs no encoding
            if file_content:
                client_socket.sendall(file_content)  # sendall() keeps sending until everything is written
            # Close this socket
            client_socket.close()
    
        # Close listening socket
        tcp_server_socket.close()
    
    if __name__ == "__main__":
        main()
    
  • TCP notes:

  • Closing the listening socket does not disconnect sockets returned by accept() whose connections are already established;

  • A blocking recv() on the server returns in two cases: the client sends data, or the client closes the connection (hangs up), in which case recv() returns empty bytes;

  • Combined with multitasking, the server can serve multiple clients at once
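
The second note is also what makes it possible to receive more than 1024 bytes: keep calling recv() until it returns empty bytes. The following is a minimal sketch, not the downloader from the notes; recv_all and demo are hypothetical names, and socket.socketpair() is used only to get two connected sockets without starting a server.

```python
import socket


def recv_all(sock, chunk_size=1024):
    """Read from sock until the peer closes its sending side."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:          # b'' means the peer has closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo(size=5000):
    """Send `size` bytes through a connected socket pair and read them all."""
    left, right = socket.socketpair()
    with left, right:
        left.sendall(b"x" * size)
        # Signal "no more data"; the reader's recv() then returns b''
        left.shutdown(socket.SHUT_WR)
        return recv_all(right)


if __name__ == "__main__":
    print(len(demo()))
```

A client written this way only works when the server closes the connection after sending, as the file server above does.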

Web server

HTTP protocol

  • The transfer protocol between browser and server. What is a protocol? An agreed specification for how the two sides communicate
  • Question: what happens when you enter http://www.baidu.com in Chrome?

    • A classic interview question: the answer can be brief or detailed, and it is a good test of one's level
  • GET means to request data from the server, and POST means to submit data to the server
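
To make the request format concrete, here is a small sketch that splits a raw request into its parts. The function name parse_request and the SAMPLE text are illustrative assumptions; a real browser request has more headers and may carry a body.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, path, version, headers)."""
    # Headers end at the first blank line
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # Request line, e.g. "GET /index.html HTTP/1.1"
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


SAMPLE = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: 127.0.0.1:7890\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

if __name__ == "__main__":
    print(parse_request(SAMPLE))
```

Header names are lower-cased here because HTTP header names are case-insensitive.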

Chrome

  • Use Chrome's Inspect function (right-click -> Inspect); the Network tab records the whole interaction

    • Here the response is chunked (Transfer-Encoding: chunked), which is used with long connections so that the browser can tell where the response ends
  • To see the raw data requested by the browser and returned by the server, click view source

    // Response headers returned by the server
    HTTP/1.1 200 OK
    Cache-Control: no-cache
    Connection: keep-alive
    Content-Encoding: gzip
    Content-Type: text/html;charset=utf-8
    Coremonitorno: 0
    Date: Fri, 08 Jan 2021 09:05:20 GMT
    Server: apache
    Set-Cookie: H_WISE_SIDS=107320_110085_127969_128698_131424_144966_151532_154619_155932_156289_156849_160573_161395_161840_162152_162156_163233_163321_163567_163805_163837_164020_164108_164163_164219_164697_164940_164955_164963_165070_165087_165236_165328_165523_165552_165564_165652_165698_165716_165736_165813_165848_166056_166143_166147_166167_166177_166180_166184_166209_166277_166282_166312_166449_166852_167107; path=/; expires=Sat, 08-Jan-22 09:05:20 GMT; domain=.baidu.com
    Set-Cookie: bd_traffictrace=081705; expires=Thu, 08-Jan-1970 00:00:00 GMT
    Set-Cookie: rsv_i=0d3ejV%2BeAsqCMFZjR8yzjtJZF3%2Fo6FYGIBXa%2F50wTW7hXSVtbGWWSXntA4sywlJiaUx1y4ZjyUXUJDkXU%2Bl4cy9fMIfE8Rs; path=/; domain=.baidu.com
    Set-Cookie: BDSVRTM=568; path=/
    Set-Cookie: eqid=deleted; path=/; domain=.baidu.com; expires=Thu, 01 Jan 1970 00:00:00 GMT
    Set-Cookie: __bsi=; max-age=3600; domain=m.baidu.com; path=/
    Strict-Transport-Security: max-age=172800
    Traceid: 161009672003635589229344073941350584633
    Vary: Accept-Encoding
    Transfer-Encoding: chunked
    
    // Request headers sent by the browser
    GET /index.html HTTP/1.1
    Host: 127.0.0.1:7890
    User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1
    

Cookie

  • Stores user browsing records and data for user profiling
  • The server sends a cookie value to the browser with the Set-Cookie header; the browser saves it and sends it back on later requests, so the server can judge, for example, whether the user has logged in;
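
As an illustration of that round trip, the standard library's http.cookies module can build the Set-Cookie line and parse what the browser sends back. The cookie name session_id and the helper names below are made up for this sketch.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value):
    """Build the Set-Cookie header line a server would send."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    return cookie.output()


def read_cookie(header_value, name):
    """Extract one cookie from the value of a browser's Cookie header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return cookie[name].value if name in cookie else None


if __name__ == "__main__":
    print(make_set_cookie("session_id", "abc123"))
    print(read_cookie("session_id=abc123; theme=dark", "session_id"))
```

A login check then amounts to looking up the returned value on the server side; how that lookup is stored is outside the scope of these notes.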

Simple implementation

Requirements analysis

  • TCP server
  • Send a simple fixed message back to the browser

Code verification

  • Start the server program
  • Enter in the browser (client): http://127.0.0.1:7890
  • The request received from the client is the content shown by view source
  • Header lines must end with \r\n, and the server must always bind its port first
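
The server program for this step is not included in the notes. A minimal sketch of what such a server might look like is given below, assuming port 7890 as in the URL above; the function names and the response text are hypothetical.

```python
import socket


def handle_one_browser(listen_socket):
    """Accept one connection, read the request, send a fixed page, close."""
    client_socket, client_addr = listen_socket.accept()
    with client_socket:
        request = client_socket.recv(1024).decode("utf-8", errors="replace")
        body = b"<h1>hello from the server</h1>"
        # Every header line ends with \r\n; an empty line ends the headers
        header = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/html; charset=utf-8\r\n"
            "Content-Length: %d\r\n"
            "\r\n" % len(body)
        )
        client_socket.sendall(header.encode("utf-8") + body)
    return request


def serve_forever(port=7890):
    """Bind first, then listen, then answer browsers one at a time."""
    listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listen_socket.bind(("", port))
    listen_socket.listen(128)
    try:
        while True:
            print(handle_one_browser(listen_socket))
    finally:
        listen_socket.close()
```

Calling serve_forever() and opening http://127.0.0.1:7890 in a browser should print the request headers in the terminal and show the fixed page in the browser.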

TCP handshake and wave

  • In principle, establishing a connection would also take four steps, like the four-way wave, but the server can send its own SYN (with its seq) together with its ACK, so three steps suffice; the client then replies with ack = server seq + 1;

  • Rule: seq continues from the sender's own previous seq, and ack is the other party's seq plus 1;

  • After sending FIN the client sends no more data but can still receive; after the client sends the last ACK, it waits for a period (the TIME_WAIT state) in case that ACK is lost and the server retransmits its FIN

  • After the four-way wave, the side that closed first keeps its port occupied for a while. This is why the client, which usually closes first, generally does not bind a fixed port; if the server closes first, its port stays occupied and an immediate restart fails to bind;

  • Note: binding and validation are not the same thing

    # Let the server rebind its port right after the four-way wave, so the port is not reported as occupied on the next run
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
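
For context, the option has to be set on the listening socket before bind(). The sketch below shows the order of the calls; make_listener is a hypothetical helper, and it binds to the loopback address with port 0 only so that it can be run anywhere without a port clash.

```python
import socket


def make_listener(port=0):
    """Create a listening TCP socket with SO_REUSEADDR enabled."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the option before bind(); setting it afterwards is too late
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_socket.bind(("127.0.0.1", port))
    server_socket.listen(128)
    return server_socket


if __name__ == "__main__":
    listener = make_listener()
    # A non-zero value means the option is switched on
    print(listener.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR))
    listener.close()
```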
    

Returning a page

  • Using courseware 05python and linux Advanced Programming stage / 1-6 code and screenshot / 07 implementation of http protocol and http server-1

  • Put the html folder in the directory where the server program is located

    import socket
    import re
    
    def service_client(new_socket):
        # Receive the browser's request
        request = new_socket.recv(1024).decode("utf-8")
        lines = request.splitlines()  # Split into lines for easy parsing
        # print(lines)
        if not lines:  # Empty request: the browser closed without sending anything
            new_socket.close()
            return
        # Parse the request line with a regular expression
        file_name = ""
        # The first line looks like this, and the match should yield /index.html
        # GET /index.html HTTP/1.1 
        # The group starts at the slash and stops at the space
        ret = re.match(r"[^/]+(/[^ ]*)", lines[0])  # * means 0 or more: the file name may be absent
        if ret:  # /index.html
            file_name = ret.group(1)
            if file_name == "/":
                file_name = "/index.html"
                
        try:
            # Try to open the requested file (a with block would also work)
            f = open("./html" + file_name, "rb")
        except OSError:
            # File not found
            response = "HTTP/1.1 404 NOT FOUND\r\n"  # Status line, read by the browser
            response += "\r\n"
            response += "File Not Found"  # Body, shown to the user
            new_socket.send(response.encode("utf-8"))
        else:
            # File opened successfully
            html_content = f.read()
            f.close()
            response = "HTTP/1.1 200 OK\r\n"
            response += "\r\n"
            
            new_socket.send(response.encode("utf-8"))
            new_socket.send(html_content)  # Read in rb mode, so no encoding is needed
            
        new_socket.close()
            
    def main():
        tcp_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Allow the port to be reused right after the server exits
        tcp_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        tcp_socket.bind(("", 7890))  # bind() takes a single (ip, port) tuple; a server must bind
        tcp_socket.listen()
        
        while True:
            new_socket, client_addr = tcp_socket.accept()
            service_client(new_socket)
            
        tcp_socket.close()
    if __name__=="__main__":
        main()
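
The regular expression is the part of this server that is easiest to get wrong, so it can help to try it on its own. The sketch below reuses the same pattern; the wrapper function requested_file is a hypothetical name added for illustration.

```python
import re

# Same pattern as in service_client: skip everything before the first "/",
# then capture from that slash up to the next space
REQUEST_LINE = re.compile(r"[^/]+(/[^ ]*)")


def requested_file(request_line, default="/index.html"):
    """Return the path from a request line, mapping "/" to the default page."""
    ret = REQUEST_LINE.match(request_line)
    if not ret:
        return None
    file_name = ret.group(1)
    return default if file_name == "/" else file_name


if __name__ == "__main__":
    for line in ("GET /index.html HTTP/1.1", "GET / HTTP/1.1", ""):
        print(repr(line), "->", requested_file(line))
```

Note that the server joins this path directly onto ./html, so a production server would also have to reject paths containing ".." before opening the file.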
    

Multi process server

  • A child process is created for each new connection inside the while loop

Multi process

  • The child process gets its own copy of the parent process's global and local variables

    import socket
    import re
    import multiprocessing
    import threading
    
    def service_client(new_socket):
        # Receive browser requests
        request = new_socket.recv(1024).decode("utf-8")
        lines = request.splitlines()	# Split into lines for easy parsing
        # print(lines)
        # Parse the request using regular and return
        file_name = ""
        # The first line below should match index.html
        # GET /index.html HTTP/1.1 
        # The slash starts and the space stops
        ret = re.match(r"[^/]+(/[^ ]*)", lines[0])  # * means 0 or more: the file name may be absent
        if ret:  # /index.html
            file_name = ret.group(1)
            if file_name == "/":
                file_name = "/index.html"
                
        try:
            # Try opening file_name file, or with structure
            f = open("./html" + file_name, "rb")
            # with open(file_name, "rb") as f:	content = f.read()	send()
        except:
            # no files found
            response = "HTTP/1.1 404 \r\n"	# For backstage
            response += "\r\n"
            response += "File Not Found"	# To the front end; Fixed information format
            new_socket.send(response.encode("utf-8"))
        else:
            # File opened successfully
            html_content = f.read()
            f.close()
            response = "HTTP/1.1 200 OK\r\n"
            response += "\r\n"
            
            new_socket.send(response.encode("utf-8"))
            new_socket.send(html_content)	# rb, no coding required
            
        new_socket.close()
            
    def main():  # Main process
        tcp_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Allow the port to be reused right after the server exits
        tcp_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        tcp_socket.bind(("", 7890))  # bind() takes a single (ip, port) tuple; a server must bind
        tcp_socket.listen()
        
        while True:
            new_socket, client_addr = tcp_socket.accept()
            
            p = multiprocessing.Process(target=service_client,args=(new_socket,))
            p.start()
            
            # Multithreading
            # t = threading.Thread(target=service_client,args=(new_socket,))
            # t.start()
            # You can switch with a vim global substitution in command mode:
            # :%s/multiprocessing/threading/g
            # With threads, do not close new_socket here: threads share the process's resources,
            # so the close() inside service_client is enough
            
            # The child process holds its own reference to the same socket (much like a hard link to one file),
            # so the close() in the child does not really close the connection; the parent must call close() as well
            new_socket.close()
            
        tcp_socket.close()
    if __name__=="__main__":
        main()
    

Coroutines

  • [note] the html folder is shared to mnt/hgfs/shareDocument in VM

    from gevent import monkey
    import gevent
    import socket
    import sys
    import re
    
    monkey.patch_all()
    
    class WSGIServer(object):
        """Define a WSGI Server class"""
    
        def __init__(self, port, documents_root):	# 7890 ./html
            # 1. Create socket
            self.server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # 2. Bind local information
            self.server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            self.server_socket.bind(("", port))
            # 3. Change to listening socket
            self.server_socket.listen(128)
            self.documents_root = documents_root
    
        def run_forever(self):
            """Run server"""
            # Wait for clients to connect
            while True:
                new_socket, new_addr = self.server_socket.accept()
                gevent.spawn(self.deal_with_request, new_socket)  # Create a coroutine and schedule it to run
    
        def deal_with_request(self, client_socket):
            """Serve this browser"""
            while True:
                # receive data 
                request = client_socket.recv(1024).decode('utf-8')  # block
                # Once the browser has received its data it closes the connection; the server then closes its socket too
                if not request:  # recv() returns empty bytes when the client has closed
                    client_socket.close()
                    break
    
                request_lines = request.splitlines()
                for i, line in enumerate(request_lines):
                    print(i, line)
    
                # Extract the requested file (index.html)
                # GET /a/b/c/d/e/index.html HTTP/1.1
                file_name = ""  # Stays empty if the request line does not match
                ret = re.match(r"([^/]*)([^ ]+)", request_lines[0])
                if ret:
                    print("Regular extraction data:", ret.group(1))
                    print("Regular extraction data:", ret.group(2))
                    file_name = ret.group(2)
                    if file_name == "/":
                        file_name = "/index.html"
    
                file_path_name = self.documents_root + file_name
                try:
                    f = open(file_path_name, "rb")
                except OSError:
                    # If the file cannot be opened, the resource does not exist, and the browser has to be told so
                    # 404
                    response_body = "There are no documents you need......".encode("utf-8")
    
                    response_headers = "HTTP/1.1 404 not found\r\n"
                    response_headers += "Content-Type:text/html;charset=utf-8\r\n"
                    response_headers += "Content-Length:%d\r\n" % len(response_body)
                    response_headers += "\r\n"
                    # The headers are text: encode them, then append the body bytes
                    send_data = response_headers.encode("utf-8") + response_body
    
                    client_socket.send(send_data)
    
                else:
                    content = f.read()
                    f.close()
                    # Response body information
                    response_body = content
                    # Response header information
                    response_headers = "HTTP/1.1 200 OK\r\n"
                    response_headers += "Content-Type:text/html;charset=utf-8\r\n"
                    response_headers += "Content-Length:%d\r\n" % len(response_body)
                    response_headers += "\r\n"
                    send_data = response_headers.encode("utf-8") + response_body
                    client_socket.send(send_data)
    
    # Set the path when the server serves static resources
    DOCUMENTS_ROOT = "./html"
    
    def main():
        """Control the web server as a whole"""
        # python3 xxxx.py 7890
        if len(sys.argv) == 2 and sys.argv[1].isdigit():
            port = int(sys.argv[1])  # The external parameter that follows xxx.py
        else:
            print("Run as: python3 xxx.py 7890")
            return
    
        print("http server is using port:%s" % port)
        http_server = WSGIServer(port, DOCUMENTS_ROOT)
        http_server.run_forever()
    
    if __name__ == "__main__":
        main()
    

Single process non blocking

  • Multitasking can avoid blocking: start a process / thread / coroutine to serve each connection

  • In a single process, blocking prevents later users from connecting and hurts the user experience; here a single process is made non-blocking instead

  • Whether single process or multi process, on a single-core CPU tasks only run concurrently (interleaved); true parallelism requires multiple cores

  • On a listening server, accept() and recv() block by default; both can be set to non-blocking mode, in which case the code is written as follows:

    from socket import *
    import time
    
    # List of all connected client sockets, stored as (socket, address) pairs
    g_socket_list = list()
    
    def main():
        server_socket = socket(AF_INET, SOCK_STREAM)
        server_socket.setsockopt(SOL_SOCKET, SO_REUSEADDR  , 1)
        server_socket.bind(('', 7890))
        server_socket.listen(128)
        # Set the listening socket to non blocking
        server_socket.setblocking(False)
    
        while True:
            # Slow the loop down for testing
            time.sleep(0.5)
            try:
                newClientInfo = server_socket.accept()  # Raises an exception if no client is waiting
            except Exception as result:
                pass
            else:
                print("A new client is coming:%s" % str(newClientInfo))
                # The new client socket is set to non blocking
                newClientInfo[0].setblocking(False)
                g_socket_list.append(newClientInfo)
    
            # On every pass, loop over the connected client sockets to see whether any has data
            # Because nothing blocks, data may have arrived since the socket was last checked
            # Iterate over a copy, because closed clients are removed from the list inside the loop
            for client_socket, client_addr in list(g_socket_list):
                try:
                    recvData = client_socket.recv(1024)
                    if recvData:
                        print('recv[%s]:%s' % (str(client_addr), recvData))
                    else:  # When the client closes, recv() returns b'' without raising
                        print('[%s]The client has been shut down' % str(client_addr))
                        client_socket.close()  # Close our side only after the client has closed its side
                        g_socket_list.remove((client_socket,client_addr))
                # An exception means no data has arrived yet (it may still be on the way), so keep the socket
                except Exception as result:
                    # print(result)  # debug information
                    pass
            print(g_socket_list)
            
    # The operating system buffers received data until recv() is called
    
    if __name__ == '__main__':
        main()
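
The exception that the program above swallows with except Exception can be seen in isolation. This is a small sketch assuming a non-blocking listening socket; try_accept is a hypothetical helper name.

```python
import socket


def try_accept(listen_socket):
    """Return (client_socket, addr) if a client is waiting, else None."""
    try:
        return listen_socket.accept()
    except BlockingIOError:
        # Raised at once when no connection is pending on a
        # non-blocking socket, instead of waiting for a client
        return None


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(128)
    listener.setblocking(False)
    print(try_accept(listener))   # no client has connected yet
    listener.close()
```

Catching BlockingIOError specifically, rather than every Exception, keeps real errors visible while the loop keeps polling.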
    
    • On Ubuntu, the network debugging assistant is installed in the opt / directory by default
    • With Sublime installed on Ubuntu, a file can be opened from the command line with subl fileName

Long and short connections

  • Long connection: after the three-way handshake establishes a connection, the connection is not released right after the data is returned; it waits for the next request

    • Web pages contain more and more elements, so with short connections a browser opens many connections to the server at once, which puts great pressure on the server
  • Short connection: one connection is established for each requested resource

  • HTTP/1.1 uses long connections by default, but the previous programs call new_socket.close() right after returning data, which makes them short connections;

  • If the previous code keeps the connection open, the browser cannot tell whether the server has finished sending after a response, so the response has to state its length:

    # Just add this header
    response_headers += "Content-Length:%d\r\n" % len(response_body)
    # See "Dahua HTTP protocol" to learn more
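
One detail worth making explicit is that Content-Length counts bytes, not characters, so the body should be encoded before its length is taken. The helper below is a sketch under that assumption; build_response is a hypothetical name, not part of the server classes in these notes.

```python
def build_response(body, status="200 OK"):
    """Return a complete HTTP/1.1 response as bytes, with Content-Length set."""
    if isinstance(body, str):
        body = body.encode("utf-8")
    headers = (
        "HTTP/1.1 %s\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (status, len(body))   # length of the encoded body, in bytes
    )
    return headers.encode("utf-8") + body


if __name__ == "__main__":
    print(build_response("<h1>ok</h1>"))
```

With the length known, the browser can reuse the same connection for its next request instead of waiting for the server to close it.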
    

Single process non blocking long connection

  • The code is as follows: implement a single process non blocking long connection server

    import time
    import socket
    import sys
    import re
    
    
    class WSGIServer(object):
        """Define a WSGI Server class"""
    
        def __init__(self, port, documents_root):
    
            # 1. Create socket
            self.server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # 2. Bind local information
            self.server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            self.server_socket.bind(("", port))
            # 3. Change to listening socket
            self.server_socket.listen(128)
            # 4. Non blocking
            self.server_socket.setblocking(False)
            self.client_socket_list = list()
            self.documents_root = documents_root
    
        def run_forever(self):
            """Run server"""
    
            # Wait for clients to connect
            while True:
                try:
                    new_socket, new_addr = self.server_socket.accept()
                except Exception as ret:
                    print("-----1----", ret)  # for test
                else:
                    new_socket.setblocking(False)
                    # Add a new client to the list
                    self.client_socket_list.append(new_socket)
                    
                # Loop over the connected clients and receive their requests (core code)
                # Iterate over a copy, because closed clients are removed inside the loop
                for client_socket in list(self.client_socket_list):
                    try:
                        # Decoding client requests
                        request = client_socket.recv(1024).decode('utf-8')
                    except Exception as ret:
                        print("------2------", ret)  # for test
                    else:
                        if request:
                            self.deal_with_request(request, client_socket)
                        else:
                            # Empty data means the client has closed the connection
                            client_socket.close()
                            self.client_socket_list.remove(client_socket)
    
                print(self.client_socket_list)
    
    
        def deal_with_request(self, request, client_socket):
            """Server for this browser"""
            if not request:
                return
    
            request_lines = request.splitlines()
            for i, line in enumerate(request_lines):
                print(i, line)
    
            # Extract the requested file (index.html), for example from:
            # GET /a/b/c/d/e/index.html HTTP/1.1
            file_name = "/index.html"  # fallback if the request line cannot be parsed
            ret = re.match(r"([^/]*)([^ ]+)", request_lines[0])
            if ret:
                print("Regex group 1 (method):", ret.group(1))
                print("Regex group 2 (path):", ret.group(2))
                file_name = ret.group(2)
                if file_name == "/":
                    file_name = "/index.html"
    
    
            # Read file data
            try:
                f = open(self.documents_root+file_name, "rb")
            except OSError:
                # Encode the body first so that Content-Length counts bytes, not characters
                response_body = "file not found, Please enter the correct url".encode("utf-8")
                response_header = "HTTP/1.1 404 Not Found\r\n"
                response_header += "Content-Type: text/html; charset=utf-8\r\n"
                response_header += "Content-Length: %d\r\n" % (len(response_body))
                response_header += "\r\n"
    
                # Return the header and the body to the browser
                client_socket.send(response_header.encode('utf-8') + response_body)
            else:
                content = f.read()
                f.close()
    
                response_body = content
                response_header = "HTTP/1.1 200 OK\r\n"
                response_header += "Content-Length: %d\r\n" % (len(response_body))
                response_header += "\r\n"
    
                # Encode the header and add the body
                client_socket.send( response_header.encode('utf-8') + response_body)
    
    
    # Set the path when the server serves static resources
    DOCUMENTS_ROOT = "./html"
    
    def main():
        """control web Server as a whole"""
        # python3 xxxx.py 7890
        if len(sys.argv) == 2 and sys.argv[1].isdigit():
            port = int(sys.argv[1])
        else:
            print("Usage: python3 xxx.py 7890")
            return
    
        print("http Server used port:%s" % port)
        http_server = WSGIServer(port, DOCUMENTS_ROOT)
        http_server.run_forever()
    
    
    if __name__ == "__main__":
        main()
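
  • To make the request-line parsing easier to follow, here is a small sketch of what the regular expression used by deal_with_request captures. The function name requested_path and its default argument are hypothetical; only the pattern itself comes from the server code.

```python
import re

# Same pattern as in the server: group 1 is everything before the first "/",
# group 2 is the path up to the next space.
REQUEST_LINE = re.compile(r"([^/]*)([^ ]+)")


def requested_path(request_line: str, default: str = "/index.html") -> str:
    """Return the path from an HTTP request line, mapping "/" to the default page."""
    match = REQUEST_LINE.match(request_line)
    if not match:
        return default
    path = match.group(2)
    return default if path == "/" else path


print(requested_path("GET /a/b/c/d/e/index.html HTTP/1.1"))
print(requested_path("GET / HTTP/1.1"))
```

  • Note that the pattern does not validate the method or the HTTP version; it only splits the line at the first slash.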
    
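  • Summary: in the single-process non-blocking model, the server accepts a connection whenever one arrives, then loops over all connected clients and answers any that have sent a request

  • What if one request takes a long time to process? Every other client still has to wait, because all requests are handled one after another in the same loop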
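
  • The loop summarized above depends on telling three outcomes of a non-blocking recv apart: no data yet, data received, and connection closed. The sketch below shows them with a local socket pair; try_recv is a hypothetical helper, and the behavior shown assumes a Unix-like system.

```python
import socket

# A connected pair of local sockets stands in for browser and server
client, server_side = socket.socketpair()
server_side.setblocking(False)


def try_recv(sock):
    """Return the received bytes, or None when nothing has arrived yet."""
    try:
        return sock.recv(1024)
    except BlockingIOError:
        return None


first = try_recv(server_side)    # nothing sent yet -> None
client.sendall(b"ping")
second = try_recv(server_side)   # data is available -> the bytes
client.close()
third = try_recv(server_side)    # peer closed -> empty bytes
server_side.close()

print(first, second, third)
```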

epoll

  • The advantage of epoll is that a single process can handle the I/O of many network connections at the same time

    • It is widely regarded as the best I/O multiplexing (readiness notification) mechanism on Linux since kernel 2.6
  • Principle: it monitors many sockets at once. When data arrives on a socket, the kernel notifies the user process

  • nginx is built on this mechanism on Linux

  • Shared memory, event notification

    • By contrast, the for loop in the previous server hands a request to the kernel for every socket on every pass and waits for each answer in turn, even when almost all sockets are idle
    • epoll principle
      • The set of monitored descriptors is kept inside the kernel (often loosely described as shared memory or mmap), so it does not have to be copied from user space on every call
      • The file descriptor (fd) identifies the socket file (on Linux, everything is a file). The kernel uses a callback mechanism to mark a descriptor as ready as soon as data arrives, and the process is told about it when it calls epoll_wait()
      • epoll can be understood as a special memory area plus a notification mechanism
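
  • The notification idea can be seen in a few lines. This is a minimal sketch, assuming Linux (select.epoll does not exist on other platforms); the socket pairs only stand in for real client connections. Three sockets are registered, one receives data, and epoll reports only that one.

```python
import select
import socket

# Three local socket pairs: (writer, reader)
pairs = [socket.socketpair() for _ in range(3)]

epoll = select.epoll()
fd_to_index = {}
for index, (_, reader) in enumerate(pairs):
    # Register each reader once; the kernel keeps track of it from now on
    epoll.register(reader.fileno(), select.EPOLLIN)
    fd_to_index[reader.fileno()] = index

# Only the second connection receives data
pairs[1][0].sendall(b"hello")

# poll() returns (fd, event) tuples for the ready descriptors only
events = epoll.poll(1)
ready = [fd_to_index[fd] for fd, event in events if event & select.EPOLLIN]
print(ready)

epoll.close()
for writer, reader in pairs:
    writer.close()
    reader.close()
```

  • The program never loops over the idle sockets; it only looks at what poll() hands back.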
  • The code below walks through the process in detail: an epoll version of the HTTP server

    import socket
    import time
    import sys
    import re
    import select
    
    
    class WSGIServer(object):
        """Define a WSGI Server class"""
    
        def __init__(self, port, documents_root):
    
            # 1. Create socket
            self.server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # 2. Bind local information
            self.server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            self.server_socket.bind(("", port))
            # 3. Change to listening socket
            self.server_socket.listen(128)
    
            self.documents_root = documents_root
    
            # Create an epoll object
            self.epoll = select.epoll()
            # Register the listening socket with epoll. Level-triggered mode (no EPOLLET)
            # is used because this server accepts only one connection per notification.
            self.epoll.register(self.server_socket.fileno(), select.EPOLLIN)
    
            # Map each registered fd to its socket object, so an event's fd can be turned back into a socket
            self.fd_socket = dict()
    
        def run_forever(self):
            """Run server"""
    
            # Wait for events
            while True:
                # Ask epoll which fds are ready; with no timeout this blocks until one is
                epoll_list = self.epoll.poll()
    
                # Handle each event (core code)
                for fd, event in epoll_list:
                    # The listening socket is readable, i.e. a new client can be accepted
                    if fd == self.server_socket.fileno():
                        new_socket, new_addr = self.server_socket.accept()
                        # Register the new connection for readable events (level-triggered,
                        # no EPOLLET, because only one recv is done per notification)
                        self.epoll.register(new_socket.fileno(), select.EPOLLIN)
                        # Record the fd -> socket mapping for later lookups
                        self.fd_socket[new_socket.fileno()] = new_socket
                    # An existing client sent data; test the bit, since other flags may be set too
                    elif event & select.EPOLLIN:
                        try:
                            request = self.fd_socket[fd].recv(1024).decode("utf-8")
                        except OSError:
                            # Connection reset: treat it like a closed connection
                            request = ""
                        if request:
                            self.deal_with_request(request, self.fd_socket[fd])
                        else:
                            # Unregister the client from epoll
                            self.epoll.unregister(fd)
                            # Close the client's socket
                            self.fd_socket[fd].close()
                            # Remove the closed client from the dictionary
                            del self.fd_socket[fd]
    
        def deal_with_request(self, request, client_socket):
            """Server for this browser"""
    
            if not request:
                return
    
            request_lines = request.splitlines()
            for i, line in enumerate(request_lines):
                print(i, line)
    
            # Extract the requested file (index.html), for example from:
            # GET /a/b/c/d/e/index.html HTTP/1.1
            file_name = "/index.html"  # fallback if the request line cannot be parsed
            ret = re.match(r"([^/]*)([^ ]+)", request_lines[0])
            if ret:
                print("Regex group 1 (method):", ret.group(1))
                print("Regex group 2 (path):", ret.group(2))
                file_name = ret.group(2)
                if file_name == "/":
                    file_name = "/index.html"
    
    
            # Read file data
            try:
                f = open(self.documents_root+file_name, "rb")
            except OSError:
                # Encode the body first so that Content-Length counts bytes, not characters
                response_body = "file not found, Please enter the correct url".encode("utf-8")
    
                response_header = "HTTP/1.1 404 Not Found\r\n"
                response_header += "Content-Type: text/html; charset=utf-8\r\n"
                response_header += "Content-Length: %d\r\n" % len(response_body)
                response_header += "\r\n"
    
                # Return the header and the body to the browser
                client_socket.send(response_header.encode('utf-8') + response_body)
            else:
                content = f.read()
                f.close()
    
                response_body = content
    
                response_header = "HTTP/1.1 200 OK\r\n"
                response_header += "Content-Length: %d\r\n" % len(response_body)
                response_header += "\r\n"
    
                # Return data to browser
                client_socket.send(response_header.encode("utf-8")+response_body)
    
    # Set the path when the server serves static resources
    DOCUMENTS_ROOT = "./html"
    
    def main():
        """control web Server as a whole"""
        # python3 xxxx.py 7890
        # sys.argv[0] is the script name, so the port is sys.argv[1]
        if len(sys.argv) == 2 and sys.argv[1].isdigit():
            port = int(sys.argv[1])
        else:
            print("Usage: python3 xxx.py 7890")
            return
    
        print("http Server used port:%s" % port)
        http_server = WSGIServer(port, DOCUMENTS_ROOT)
        http_server.run_forever()
    
    
    if __name__ == "__main__":
        main()
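
  • A note on the select.EPOLLET flag: it selects edge-triggered mode, in which epoll reports a descriptor once when new data arrives and stays silent until more arrives, even if the earlier data is still unread. Without the flag (level-triggered mode), epoll keeps reporting the descriptor while unread data remains. The sketch below, assuming Linux, sends data once, never reads it, and polls twice in each mode; count_notifications is a hypothetical helper.

```python
import select
import socket


def count_notifications(flags):
    """Send 4 bytes once, never read them, and poll twice.
    Returns how many events each poll reported."""
    writer, reader = socket.socketpair()
    epoll = select.epoll()
    epoll.register(reader.fileno(), flags)
    writer.sendall(b"data")
    first = len(epoll.poll(1))   # wait up to 1 second for the first event
    second = len(epoll.poll(0))  # return immediately
    epoll.close()
    writer.close()
    reader.close()
    return first, second


level = count_notifications(select.EPOLLIN)
edge = count_notifications(select.EPOLLIN | select.EPOLLET)
print("level-triggered:", level)
print("edge-triggered: ", edge)
```

  • This is why an edge-triggered server normally uses non-blocking sockets and keeps calling accept or recv until they raise BlockingIOError; a server that reads only once per notification is safer in level-triggered mode.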
    
  • Summary: why is single-process epoll the fastest?
    • Multi-process, multi-thread, coroutine, single-process non-blocking and epoll servers are all ways of achieving concurrency
    • Process switching is the most expensive; thread switching is cheaper but still has a cost
    • A single-process non-blocking loop avoids switching, but it has to ask the kernel about every socket on every pass, including the idle ones
    • epoll registers each file descriptor with the kernel once, which saves the repeated copying and scanning, and uses callbacks to deliver notifications only for the descriptors that are ready, which is faster

  • In other words, epoll lets a single process listen on many sockets at once and spend its time only on the ones that have work to do, which makes the single-process non-blocking model much more efficient

  • Thinking questions:

    • On a 4-core CPU, using Python to clean data: multiple processes or multiple threads?
    • A: batch data processing is a CPU-intensive task. Because of the GIL (only one thread executes Python bytecode at a time; the lock is released during I/O operations), threads cannot use more than one core for this kind of work. Creating four processes, one per core, is the better choice
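
  • A minimal sketch of the four-process answer, using multiprocessing.Pool from the standard library. The cleaning step clean_record is purely hypothetical (trim whitespace and lowercase); real data cleaning would replace it.

```python
from multiprocessing import Pool


def clean_record(record: str) -> str:
    """Hypothetical cleaning step: collapse whitespace and lowercase the text."""
    return " ".join(record.split()).lower()


def clean_all(records, workers=4):
    """Clean the records in parallel, one worker process per CPU core."""
    with Pool(processes=workers) as pool:
        # map() splits the records across the workers and keeps the input order
        return pool.map(clean_record, records)
```

  • Call clean_all(records) from under an if __name__ == "__main__": guard, so that worker processes can import the module safely. Each worker has its own interpreter and its own GIL, which is what lets the work spread across all four cores.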

Tags: Python udp TCPIP webserver

Posted on Tue, 14 Sep 2021 18:49:31 -0400 by tambo