Secure password handling in Python

Almost every application requires some form of authentication, password processing, or the use of secure credentials such as API keys. You may not be a security expert, but you should know how to securely handle all these passwords and credentials to protect the credentials and data of application users, as well as your own API keys and various tokens.

Keeping these security elements safe includes generating them, verifying them, storing them securely and protecting them from adversaries. So, in this article, we will explore Python libraries, tools and concepts that help with exactly that!

Prompting for a password

Let's start simple: you have a basic Python application with a command line interface and you need to ask the user for a password. You could use input(), but that would show the password in the terminal. To avoid that, use getpass instead:

import getpass

user = getpass.getuser()
password = getpass.getpass()
# Do Stuff...

getpass is a very simple module that lets you prompt the user for a password and also get their user name by extracting the current user's login name. Be aware, however, that not every system supports hiding the password as it is typed. Python will try to warn you about that, so just read the warnings on the command line.
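If you would rather stop than risk echoing a password, one option is to turn that warning into an error. The sketch below assumes a hypothetical helper named prompt_hidden (it is not part of getpass). It relies on getpass.GetPassWarning, the warning category the module issues when it cannot turn off echo. The ask parameter is an assumption too, added only so the prompt function can be swapped out.

```python
import getpass
import warnings


def prompt_hidden(prompt="Password: ", ask=getpass.getpass):
    """Prompt for a password, but refuse to continue if input would be echoed.

    prompt_hidden and its ask parameter are illustrative names, not getpass API.
    """
    with warnings.catch_warnings():
        # Promote the "cannot hide input" warning to an exception.
        warnings.simplefilter("error", getpass.GetPassWarning)
        try:
            return ask(prompt)
        except getpass.GetPassWarning:
            raise RuntimeError("this terminal cannot hide password input") from None
```

On a normal terminal, password = prompt_hidden() behaves just like getpass.getpass().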

Generating passwords

Sometimes it is preferable to generate a password rather than prompt the user for one, for example when you want to set an initial password that gets changed on first login.

There is no password generation library, but it is not difficult to implement:

import string
import secrets

length = 15
# Choose wide set of characters, but consider what your system can handle
alphabet = string.ascii_letters + string.digits + string.punctuation
password = ''.join(secrets.choice(alphabet) for i in range(length))

A password generated with the code above will be strong, but very hard to remember. That is fine for an initial, temporary password or a transient token, but if the user is supposed to keep using it for longer, a passphrase is more appropriate.
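Some systems insist that a password contains every character class. A minimal sketch of one way to get that, following the same pattern as the recipes in the secrets documentation, is to keep drawing candidates until one qualifies. The function name generate_password and the exact rules are assumptions made for this example.

```python
import secrets
import string


def generate_password(length=15):
    """Return a random password containing at least one lowercase letter,
    one uppercase letter, one digit and one punctuation character."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character class")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        # Draw a fresh candidate and keep it only if every class is present.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate


print(generate_password())
```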

We could build a passphrase generator the same way we built the simple password generator above, but why bother if there is a library for that? The library is called xkcdpass, after the famous XKCD comic about password strength, and it does exactly what the comic describes: it generates strong passphrases made of words:

# pip install xkcdpass
from xkcdpass import xkcd_password as xp

word_file = xp.locate_wordfile()
words = xp.generate_wordlist(wordfile=word_file, min_length=5, max_length=10)

for i in range(4):
    print(xp.generate_xkcdpassword(words, acrostic="python", numwords=6, delimiter="*"))

# punch*yesterday*throwback*heaviness*overnight*numbing
# plethora*yesterday*thigh*handlebar*outmost*natural
# pyromania*yearly*twisty*hyphen*overstuff*nuzzle
# pandemic*yearly*theology*hatching*overlaid*neurosis

This snippet first finds a word/dictionary file on your system, such as /usr/dict/words, selects all words of the specified length and builds the word list from which the passphrases are generated. The generator itself has a few arguments that we can use to customize the passphrase. Apart from the obvious ones, like the number of words and their length, it also has the acrostic parameter: a word whose characters are used as the first letters of the words in the passphrase (sounds complicated? Just look at the example passphrases above).

If you really want to build this yourself instead of adding a dependency to your project, you can do so with the secrets module from the standard library.
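As a rough idea of what a dependency-free version could look like, here is a minimal sketch built on secrets.choice. The WORDS list and the generate_passphrase function are made up for this example. A list this small gives almost no security, so a real one should hold thousands of words, for instance loaded from a system dictionary file.

```python
import secrets

# Tiny illustrative word list (an assumption for this sketch);
# a real list needs thousands of words to produce strong passphrases.
WORDS = ["correct", "horse", "battery", "staple", "orange", "window",
         "planet", "guitar", "silver", "candle", "forest", "rocket"]


def generate_passphrase(words, numwords=4, delimiter="-"):
    """Join numwords words picked with a cryptographically secure RNG."""
    return delimiter.join(secrets.choice(words) for _ in range(numwords))


print(generate_passphrase(WORDS))
```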

Hashing

Now that we have asked the user for a password or generated one for them, what do we do with it? We might want to store it somewhere in a database, but as you probably (hopefully) know, you should never store passwords in plaintext. Why is that?

Well, passwords should never be stored in a recoverable format, whether plaintext or encrypted. They should be hashed using a cryptographically strong one-way function. That way, if someone gets hold of the passwords in the database, they will have a hard time recovering any actual password, because the only way to recover a password from a hash is brute force: taking possible plaintext passwords, hashing them with the same algorithm and comparing the results with the entries in the database.

To make brute-forcing harder, a salt should be used as well. A salt is a random string stored next to the hashed password. It is appended to the password before hashing, which makes it more random and therefore harder to guess (using rainbow tables).

However, because modern hardware can attempt billions of hashes per second, making the password hard to guess is not enough on its own. That is why slow hash functions are used for password hashing, which makes brute-forcing passwords much less efficient for an attacker.

(Note: the above greatly simplifies the logic and the reasons for using these hash functions. For a more thorough explanation, see the article.)
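To make the effect of a salt concrete, here is a small sketch using only the standard library. It uses PBKDF2 from hashlib rather than the bcrypt algorithm used later in this article, purely to avoid extra dependencies. The function name slow_salted_hash and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import secrets


def slow_salted_hash(password, salt, iterations=200_000):
    """Derive a hash from password and salt with PBKDF2-HMAC-SHA256.

    The iteration count is what makes the function deliberately slow.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


# Two users pick the same password but get different random salts.
salt_a = secrets.token_bytes(16)
salt_b = secrets.token_bytes(16)

hash_a = slow_salted_hash("hunter2", salt_a)
hash_b = slow_salted_hash("hunter2", salt_b)

# The stored hashes differ, so a precomputed table for "hunter2" is useless.
print(hash_a.hex())
print(hash_b.hex())
```

Hashing the same password with the same salt still gives the same result, which is what makes verification possible.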

There are quite a few libraries and individual hashing algorithms out there, but the requirements above narrow our choices considerably. The go-to solution for hashing in Python should be passlib, as it provides the right algorithms as well as a high-level interface that is usable even by people who aren't well versed in cryptography.

# pip install passlib
from passlib.hash import bcrypt
from getpass import getpass

print(bcrypt.setting_kwds)
# ('salt', 'rounds', 'ident', 'truncate_error')
print(bcrypt.default_rounds)
# 12

hasher = bcrypt.using(rounds=13)  # Make it slower

password = getpass()
hashed_password = hasher.hash(password)
print(hashed_password)
# $2b$13$H9.qdcodBFCYOWDVMrjx/uT.fbKzYloMYD7Hj2ItDmEOnX5lw.BX.
# \__/\/ \____________________/\_____________________________/
# Alg Rounds  Salt (22 char)            Hash (31 char)

print(hasher.verify(password, hashed_password))
# True
print(hasher.verify("not-the-password", hashed_password))
# False

In this snippet we use bcrypt as our algorithm of choice, as it is one of the most popular and well-tested hashing algorithms. First, we inspect its possible settings and check the default number of rounds the algorithm uses. We then modify the hasher to use more rounds (cost factor), which makes hashing slower and therefore the hash harder to crack. This number should be the largest value that doesn't cause intolerable delay (~300 ms) for your users. passlib updates the default rounds value periodically, so you don't necessarily need to change it.

With the hasher ready, we prompt the user for a password and hash it. At this point we could store it in a database. For demonstration purposes, we instead go on to verify it against the original plaintext password.
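One way to pick a cost factor near the ~300 ms budget mentioned above is to measure it on the machine that will do the hashing. The sketch below is only an illustration of that idea: it uses PBKDF2 from hashlib instead of bcrypt to stay dependency-free, and time_hash and calibrate are hypothetical helper names. With bcrypt the same approach applies, except that each additional round doubles the work.

```python
import hashlib
import time


def time_hash(iterations):
    """Return how many seconds one PBKDF2-HMAC-SHA256 hash takes at this cost."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark-password", b"benchmark-salt!!", iterations)
    return time.perf_counter() - start


def calibrate(target_seconds=0.3, start=50_000):
    """Double the iteration count until a single hash takes at least target_seconds."""
    iterations = start
    while time_hash(iterations) < target_seconds:
        iterations *= 2
    return iterations
```

A single timing run is noisy, so treat the result as a starting point rather than a precise measurement.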

From the code above we can see that using passlib boils down to the hash and verify methods of our algorithm of choice. If, however, you want more control over schemes, rounds and so on, you can use CryptContext:

from getpass import getpass
from passlib.context import CryptContext
ctx = CryptContext(schemes=["bcrypt", "argon2", "scrypt"],
                   default="bcrypt",
                   bcrypt__rounds=14)

password = getpass()
hashed_password = ctx.hash(password)
print(hashed_password)
# $2b$14$pFTXqnHjn91C8k8ehbuM.uSJM.H5S0l7vkxE8NxgAiS2LiMWMziAe

print(ctx.verify(password, hashed_password))
print(ctx.verify("not-the-password", hashed_password))

This context object lets us work with multiple schemes, set defaults and configure cost factors. If authentication in your application is simple this may be unnecessary, but if you need to use multiple hashing algorithms, deprecate them, rehash hashes or perform similar advanced tasks, you may want to look at the full CryptContext integration tutorial.
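To show what "rehash" means in practice, here is a standard-library sketch of upgrading a stored hash when a user logs in. It is not how passlib implements it. The storage format, the CURRENT_ITERATIONS policy value and both function names are assumptions made for this example, and PBKDF2 stands in for bcrypt to avoid dependencies.

```python
import hashlib
import hmac
import secrets

CURRENT_ITERATIONS = 200_000  # hypothetical current policy


def hash_password(password, iterations=CURRENT_ITERATIONS):
    """Return a string of the form scheme$iterations$salt$digest."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_and_update(password, stored):
    """Return (is_valid, new_hash); new_hash is None unless a rehash is due."""
    scheme, iterations, salt_hex, digest_hex = stored.split("$")
    if scheme != "pbkdf2_sha256":
        raise ValueError(f"unsupported scheme: {scheme}")
    iterations = int(iterations)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), iterations
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(digest, bytes.fromhex(digest_hex)):
        return False, None
    if iterations < CURRENT_ITERATIONS:
        # The plaintext is only available at login, so upgrade the hash now.
        return True, hash_password(password)
    return True, None
```

When new_hash is not None, the caller would save it in place of the old entry.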

Another reason you might want to use CryptContext is if you need to deal with operating system passwords, such as those in /etc/shadow. For that you can use passlib.hosts; see the example for details.

For completeness, here are a few other available libraries along with their (different) use cases:

  • bcrypt is the library and algorithm we used above. It is the same code that passlib uses, and there is no real reason to use this lower-level library directly.
  • crypt is a Python standard library module that provides functions that can be used for password hashing. However, the algorithms it provides depend on your system, and the ones listed in its documentation are not as strong as those shown above. Note that crypt was deprecated in Python 3.11 and removed in 3.13.
  • hashlib is another built-in module. This one does include strong hash functions that are suitable for password hashing. Its interface makes the functions more customizable, so it takes more knowledge to use them correctly (securely). You can definitely use functions from this module, such as hashlib.scrypt, to hash your passwords.
  • hmac, the last hashing module the Python standard library offers, is not suitable for password hashing. HMAC is used to verify the integrity and authenticity of messages and does not have the properties required for password hashing.
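For the hashlib route mentioned in the list, a minimal sketch with hashlib.scrypt might look like the following. The cost parameters and the scrypt_hash and scrypt_verify helper names are assumptions for this example. You would need to tune the parameters for your hardware and store the salt alongside the digest.

```python
import hashlib
import hmac
import secrets

# Cost parameters are assumptions for this sketch; tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def scrypt_hash(password):
    """Return (salt, digest) for password using hashlib.scrypt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, dklen=32, **SCRYPT_PARAMS)
    return salt, digest


def scrypt_verify(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, dklen=32, **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, expected)
```

This is the extra bookkeeping that passlib otherwise does for you: generating the salt, remembering the parameters and comparing safely.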

Note: with your newly acquired knowledge of how to store passwords properly, imagine that you forgot your password for some service. You click "Forgot password?" on the website, and instead of a recovery link they send you your actual password. That means they store your password in plaintext, which also means you should run away from that service (and if you use the same password elsewhere, change it).

Safe storage

In the previous section, we assumed that the goal was to store the credentials of the application's users, but what about your own passwords, the ones you use to log in to remote systems?

Leaving passwords in code is obviously a bad choice, as they sit there in plaintext for anyone to see and you run the risk of accidentally pushing them to a git repository. A better option is to store them in environment variables. You can create a .env file, add it to .gitignore and populate it with the credentials needed for the current project. You can then load all these variables into your application with dotenv, like so:

# pip install python-dotenv
import os
from os.path import join, dirname
from dotenv import load_dotenv

dotenv_path = join(dirname(__file__), ".env")
load_dotenv(dotenv_path)

API_KEY = os.environ.get("API_KEY", "default")

print(API_KEY)
# a3491fb2-000f-4d9f-943e-127cfe29c39c

This snippet first builds the path to the .env file using os.path functions and then uses it to load the environment variables with load_dotenv(). If your .env file is in the current directory, as in the example above, you can simplify the code by calling load_dotenv(find_dotenv()), which finds the environment file automatically. Once the file is loaded, all that remains is to retrieve the individual variables using os.environ.get.
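One thing to watch with os.environ.get is the fallback value: a placeholder such as "default" lets the program keep running with a bogus credential when the variable is missing. For real secrets it may be preferable to fail at startup instead. A minimal sketch, where require_env is a hypothetical helper name:

```python
import os


def require_env(name):
    """Return the value of environment variable name, or fail loudly."""
    value = os.environ.get(name)
    if not value:
        # Missing or empty: stop here rather than continue with a fake secret.
        raise RuntimeError(f"missing required environment variable: {name}")
    return value
```

After load_dotenv() has run, API_KEY = require_env("API_KEY") would then raise immediately if the .env file lacks the entry.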

Or, if you don't want to pollute your environment with application variables and secrets, you can load them directly like this:

from dotenv import dotenv_values

config = dotenv_values(".env")
print(config)
# OrderedDict([('API_KEY', 'a3491fb2-000f-4d9f-943e-127cfe29c39c')])

The solution above is good, but we can do better. Instead of storing passwords in an unprotected file, we can use the system's keyring, an application that can store secure credentials in an encrypted file in your home directory. By default this file is encrypted with your user account login password, so it gets unlocked automatically when you log in and you don't need to worry about an extra password.

To use keyring credentials in a Python application, we can use a library called keyring:

# pip install keyring
import keyring
import keyring.util.platform_ as keyring_platform

print(keyring_platform.config_root())
# /home/username/.config/python_keyring  # Might be different for you

print(keyring.get_keyring())
# keyring.backends.SecretService.Keyring (priority: 5)

NAMESPACE = "my-app"
ENTRY = "API_KEY"

keyring.set_password(NAMESPACE, ENTRY, "a3491fb2-000f-4d9f-943e-127cfe29c39c")
print(keyring.get_password(NAMESPACE, ENTRY))
# a3491fb2-000f-4d9f-943e-127cfe29c39c

cred = keyring.get_credential(NAMESPACE, ENTRY)
print(f"Password for username {cred.username} in namespace {NAMESPACE} is {cred.password}")
# Password for username API_KEY in namespace my-app is a3491fb2-000f-4d9f-943e-127cfe29c39c

In the code above, we first check the location of the keyring config file, which is where you can make configuration adjustments if needed. We then check the active keyring and go on to add a password to it. Each entry has 3 attributes: service, user name and password, where the service acts as a namespace, which in this case is the name of the application. To create and retrieve entries, we simply use set_password and get_password respectively. Alternatively, get_credential can be used; it returns a credential object that has username and password attributes.
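A common pattern on top of this is to prompt for a secret only the first time and read it from the keyring afterwards. The sketch below assumes a hypothetical load_secret helper. Its store and ask parameters are illustrative, so that the keyring backend and the prompt can be swapped out. It relies on get_password returning None when no entry exists.

```python
import getpass


def load_secret(service, name, store=None, ask=getpass.getpass):
    """Fetch a secret from the keyring, prompting once and saving it if absent.

    load_secret, store and ask are illustrative names; store only needs
    get_password and set_password methods and defaults to the keyring module.
    """
    if store is None:
        import keyring  # pip install keyring
        store = keyring
    secret = store.get_password(service, name)
    if secret is None:  # no entry stored yet
        secret = ask(f"{name} for {service}: ")
        store.set_password(service, name, secret)
    return secret
```

Calling load_secret("my-app", "API_KEY") would then prompt on the first run and return the stored value silently on later runs.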

Closing thoughts

Even if you are not a security expert, you are still responsible for the basic security features of the application you build. This includes good handling of user data, especially passwords, so I hope these examples and recipes can help you do this.

In addition to the methods and techniques shown in this article, the best way to deal with passwords is to avoid using passwords entirely by delegating authentication to OIDC providers (such as Google or GitHub), or replacing them with key based authentication and encryption, which we will discuss in the next article.


Posted on Mon, 18 Oct 2021 23:17:16 -0400 by Aleks