Python Create File – How to Append and Write to a Text File
As a full-stack developer, you'll frequently need to read and write data to files in your Python backend services. File input/output (I/O) is an essential skill for everything from saving user preferences and caching to logging and data analysis. Virtually every non-trivial Python web app or script will need to persist data to the file system at some point.
In this in-depth guide, we'll cover all the key aspects of creating, writing to, appending, and reading files using Python's built-in functions and modules. You'll see detailed code examples, learn best practices and performance optimizations, and get expert tips for working with files in real-world projects. Let's get started!
Creating and Writing Text Files
To create a new file or overwrite an existing one, you use the open() function in write ('w') mode:

with open('example.txt', 'w') as file:
    file.write('Hello World!\n')
    file.write('This is a new file.\n')

The open() function takes two key arguments: the file path and the mode. For writing to a file, we pass 'w' as the mode. This opens the file for writing and truncates (clears) it if it already exists. If the file doesn't exist, open() will create it.
Using the file handle returned by open(), we write strings to the file with the write() method. Unlike the print() function, write() does not automatically add newline characters, so you need to include them manually (\n) to write multiple lines.
It's important to always close a file after you're done writing to it. The best way to do this is using a with block, as shown above. This ensures the file is closed automatically once the with block's scope ends, even if an exception is raised. If you don't use with, be sure to call close() on the file handle explicitly.
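To make the newline behavior concrete, here's a small sketch contrasting write() with print(file=...), which adds the newline for you (the filename is just an illustration):

```python
# write() needs explicit newlines; print(file=...) adds them for you.
with open('greetings.txt', 'w') as file:
    file.write('Hello World!\n')              # newline written manually
    print('This is a new file.', file=file)   # newline added by print()

with open('greetings.txt') as file:
    lines = file.readlines()

print(lines)  # ['Hello World!\n', 'This is a new file.\n']
```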
Appending to Files
Sometimes you want to add to an existing file without overwriting its contents. For this, we use append mode by passing 'a' to open():

with open('example.txt', 'a') as file:
    file.write('This line is appended.\n')

Append mode preserves the current contents and writes any new data to the end of the file. If the file doesn't exist, opening it in append mode will create it, just like write mode.
Be careful when mixing write and append mode on the same file, as it's easy to accidentally overwrite data. Usually it's best to pick one mode and stick with it for a given file in your program.
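As a quick sketch of the difference, 'w' truncates while 'a' preserves (the filename here is arbitrary):

```python
# 'w' truncates: after this, the file contains only one line.
with open('log.txt', 'w') as file:
    file.write('first run\n')

# 'a' appends: the existing line is preserved.
with open('log.txt', 'a') as file:
    file.write('second run\n')

with open('log.txt') as file:
    print(file.read())  # first run / second run
```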
Reading File Contents
Once you've written some data to a file, you'll probably want to read it back at some point. For this, we use read mode by passing 'r' to open():

with open('example.txt', 'r') as file:
    contents = file.read()
    print(contents)

This reads the entire contents of the file into a string using read(). If you want to process the lines individually, you can use a for loop:

with open('example.txt', 'r') as file:
    for line in file:
        print(line.strip())

This iterates over the file line by line. The strip() method removes the newline character from the end of each line before printing.
For large files, it's often better to process them line by line instead of reading the whole file at once, to avoid using too much memory. Python will automatically buffer the file I/O for efficient reading and writing.
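For example, counting the lines in a large file this way never needs more than one line in memory at a time (the filename and line count are just for illustration):

```python
def count_lines(path):
    """Count lines without loading the whole file into memory."""
    with open(path, 'r') as file:
        return sum(1 for _ in file)  # the file object yields one line at a time

# Build a sample file with 10,000 lines to count.
with open('big.txt', 'w') as file:
    file.writelines(f'line {i}\n' for i in range(10_000))

print(count_lines('big.txt'))  # 10000
```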
Text vs Binary Mode
By default, open() operates in text mode, which automatically handles line endings and other text formatting details for you based on the platform. On Windows, \r\n line endings are translated to \n when reading (and back when writing); on Unix/Linux, \n line endings are left alone.
However, sometimes you need to read or write binary data like images, sound files, or serialized Python objects. In this case, you need to use binary mode by adding 'b' to the mode string:

with open('example.bin', 'wb') as file:
    file.write(b'\x00\x01\x02\x03')
In binary mode, data is read and written as raw bytes with no formatting translation. When reading a binary file, read() returns a bytes object instead of a str.
Be careful not to use binary mode for text data or vice versa, as this can lead to formatting errors and corrupt data.
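A minimal round-trip sketch shows the bytes coming back exactly as written:

```python
data = bytes([0, 1, 2, 3])

with open('example.bin', 'wb') as file:
    file.write(data)

with open('example.bin', 'rb') as file:
    loaded = file.read()  # returns bytes, not str

print(type(loaded).__name__)  # bytes
print(loaded == data)         # True
```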
Paths and Directories
So far we've used simple filenames that put the file in the same directory as the Python script. For larger projects, you'll often need to specify paths to organize files into subdirectories.
Python's os module provides many helpful functions for working with the file system in a cross-platform manner:

import os

# Get current working directory
cwd = os.getcwd()

# Create a new directory
os.mkdir('example_dir')

# Change current directory
os.chdir('example_dir')

# Get list of files and subdirectories
files = os.listdir('.')

# Construct file path
file_path = os.path.join(cwd, 'example_dir', 'file.txt')
with open(file_path, 'w') as file:
    file.write('File in a subdirectory.\n')
Always use os.path.join() to construct file paths instead of manually concatenating strings. This ensures your code works correctly on different operating systems.
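The standard library's pathlib module offers a more object-oriented alternative to os and os.path; this sketch mirrors the directory-and-file example above:

```python
from pathlib import Path

# Path objects overload '/' to join path components portably.
base = Path.cwd() / 'example_dir'
base.mkdir(exist_ok=True)  # no error if the directory already exists

file_path = base / 'file.txt'
file_path.write_text('File in a subdirectory.\n')  # open, write, close in one call

print(file_path.read_text())
print(file_path.name)    # 'file.txt'
print(file_path.suffix)  # '.txt'
```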
Serializing Objects
Python's pickle module allows you to easily serialize and deserialize Python objects to and from files. This is useful for caching expensive computations or saving complex application state.
Here's how to pickle a dictionary to a file:

import pickle

data = {
    'name': 'Alice',
    'age': 30,
    'hobbies': ['reading', 'running', 'coding']
}

with open('data.pkl', 'wb') as file:
    pickle.dump(data, file)
And here's how to unpickle it:

with open('data.pkl', 'rb') as file:
    loaded_data = pickle.load(file)

print(loaded_data)
Pickle uses a binary format, so you need to use binary read/write mode. Also, never unpickle data from an untrusted source, as it can lead to arbitrary code execution.
Structured File Formats
In addition to working with raw text and binary files, Python has excellent support for structured formats like CSV, JSON, and XML.
To read and write CSV files, use the built-in csv module (the docs recommend opening CSV files with newline='' for both reading and writing):

import csv

with open('example.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Name', 'Age', 'City'])
    writer.writerow(['Alice', 30, 'New York'])
    writer.writerow(['Bob', 25, 'San Francisco'])

with open('example.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
For JSON, use the json module:

import json

data = {
    'name': 'Alice',
    'age': 30,
    'city': 'New York'
}

with open('example.json', 'w') as file:
    json.dump(data, file)

with open('example.json', 'r') as file:
    loaded_data = json.load(file)

print(loaded_data)
These modules handle the details of serializing and parsing the structured formats for you, so you can focus on working with the data itself.
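If you'd rather work with column names than positional lists, csv also provides DictWriter and DictReader; the field names and filename here are just an illustration:

```python
import csv

rows = [
    {'name': 'Alice', 'age': '30', 'city': 'New York'},
    {'name': 'Bob', 'age': '25', 'city': 'San Francisco'},
]

with open('people.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=['name', 'age', 'city'])
    writer.writeheader()      # writes the 'name,age,city' header row
    writer.writerows(rows)

with open('people.csv', newline='') as file:
    loaded = list(csv.DictReader(file))  # each row becomes a dict keyed by header

print(loaded[0]['city'])  # New York
```

Note that csv always reads values back as strings; convert numeric columns yourself if you need ints or floats.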
Asynchronous File I/O
In modern async Python web frameworks like FastAPI and Starlette, you'll often need to perform file I/O without blocking the main event loop. For this, we can use the aiofiles library to perform non-blocking file operations:

import aiofiles

async def write_file():
    async with aiofiles.open('example.txt', 'w') as file:
        await file.write('Hello async world!\n')

async def read_file():
    async with aiofiles.open('example.txt', 'r') as file:
        contents = await file.read()
        print(contents)

The aiofiles API is very similar to the standard open() function, but uses async with and await to perform the I/O asynchronously. This allows your async web server to handle many concurrent file operations without blocking.
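If you can't add a third-party dependency, one stdlib-only pattern (Python 3.9+) is to push the blocking calls onto a worker thread with asyncio.to_thread; a minimal sketch, with hypothetical filenames:

```python
import asyncio

def write_file(path, text):
    # Plain blocking I/O, safe to run in a worker thread.
    with open(path, 'w') as file:
        file.write(text)

def read_file(path):
    with open(path) as file:
        return file.read()

async def main():
    # to_thread runs the blocking function without stalling the event loop.
    await asyncio.to_thread(write_file, 'async_example.txt', 'Hello async world!\n')
    contents = await asyncio.to_thread(read_file, 'async_example.txt')
    print(contents)

asyncio.run(main())
```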
File I/O Performance Tips
When working with large files or many small files in performance-critical code, optimizing your file I/O can have a big impact. Here are a few tips:
- Use buffering to read and write data in chunks instead of one byte at a time. The default buffer size (typically 4096 or 8192 bytes) is usually sufficient, but you can experiment with different sizes to find the optimum for your workload.
- When processing structured data, consider binary formats like pickle, msgpack, or parquet instead of JSON or CSV. Binary formats are more compact and generally much faster to serialize and parse.
- If you need to read an entire file into memory, use read() instead of readlines() or iterating over the file object. A single read() call avoids per-line overhead and can be much faster for large files.
- Be careful with global interpreter lock (GIL) contention in multithreaded code. Python releases the GIL during blocking I/O calls, so threads can overlap I/O-bound work, but any CPU-bound processing between reads will serialize on the GIL. Consider multiprocessing for CPU-bound workloads and async I/O for I/O-bound ones.
- Profile and measure before optimizing. Use Python's built-in cProfile module or third-party tools like py-spy to identify file I/O bottlenecks in your code. Don't waste time optimizing code paths that aren't performance-critical.
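To make the buffering point concrete, here's a chunked binary copy that never holds more than one buffer's worth of data in memory (the chunk size and filenames are arbitrary choices for the sketch):

```python
def copy_file(src, dst, chunk_size=8192):
    """Copy src to dst in fixed-size binary chunks."""
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        while True:
            chunk = fin.read(chunk_size)  # at most chunk_size bytes per read
            if not chunk:
                break                     # an empty bytes object means EOF
            fout.write(chunk)

# Build a 100 KB sample file and copy it.
with open('source.bin', 'wb') as file:
    file.write(b'x' * 100_000)

copy_file('source.bin', 'dest.bin')
```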
Real-World Examples
To solidify these file I/O concepts, let's look at a few real-world examples you're likely to encounter as a Python web developer.
Web Scraping
When scraping websites, you'll often want to cache the HTML locally to avoid re-fetching it on every run. Here's a simple caching function using file I/O:

import os
import requests

def cached_get(url, cache_dir='cache'):
    filename = url.split('/')[-1]
    filepath = os.path.join(cache_dir, filename)
    if os.path.exists(filepath):
        with open(filepath, 'r') as file:
            content = file.read()
    else:
        os.makedirs(cache_dir, exist_ok=True)
        response = requests.get(url)
        content = response.text
        with open(filepath, 'w') as file:
            file.write(content)
    return content
This function takes a URL and a cache directory, fetches the contents of the URL, and saves it to a file in the cache directory. On subsequent calls with the same URL, it reads the cached file instead of re-fetching the data. This can significantly speed up scrapers that revisit the same pages often.
Data Processing
In data-heavy web services, you'll frequently need to export data from a database to a file for analysis or import data from files into a database. Here's an example of exporting MySQL data to a CSV file:
import csv
import mysql.connector

db = mysql.connector.connect(
    host='localhost',
    user='user',
    password='password',
    database='example'
)

cursor = db.cursor()
cursor.execute('SELECT * FROM users')

with open('users.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow([i[0] for i in cursor.description])
    writer.writerows(cursor)

db.close()
This code connects to a MySQL database, executes a SELECT query, and writes the results to a CSV file using the csv module. The cursor.description attribute provides the column names for the header row.
You can adapt this pattern to export data from any database to any file format, or to import data from files into a database using SQL INSERT statements.
Logging
Proper logging is crucial for debugging and monitoring production web services. Python's built-in logging module makes it easy to write logs to files:

import logging

logging.basicConfig(
    filename='example.log',
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s: %(message)s'
)

logging.debug('This is a debug message')
logging.info('This is an info message')
logging.warning('This is a warning message')
logging.error('This is an error message')
logging.critical('This is a critical message')
This configures the logging module to write logs to a file called example.log with a custom format including the timestamp, log level, and message. You can then use the different logging methods (debug(), info(), etc.) to write messages at different severity levels.
The logging module supports rotating log files, different handlers for different log levels, and much more. See the official docs for details.
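As a sketch of rotation, RotatingFileHandler caps the log file's size and keeps a fixed number of backups; the size limit, backup count, and logger name here are arbitrary choices:

```python
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger('app')
logger.setLevel(logging.DEBUG)

# Roll over at ~1 KB, keeping up to 3 old files (app.log.1, app.log.2, ...).
handler = RotatingFileHandler('app.log', maxBytes=1024, backupCount=3)
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s'))
logger.addHandler(handler)

for i in range(50):
    logger.info('message %d', i)  # enough output to trigger at least one rollover
```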
Conclusion
In this comprehensive guide, we've covered everything you need to know to work with files effectively in Python. Some key takeaways:
- Use open() in write mode ('w') to create or overwrite a file, append mode ('a') to add to an existing file, and read mode ('r') to read a file's contents.
- Always close files after you're done with them, preferably using a with block to ensure cleanup even if an exception is raised.
- Use binary mode ('b') for non-text data like images, archives, and serialized objects.
- The os and os.path modules provide portable functions for working with files and directories across different operating systems.
- Python has built-in support for reading and writing structured file formats like CSV and JSON using the csv and json modules.
- When working with async frameworks, use libraries like aiofiles for non-blocking file I/O.
- Optimize file I/O by using buffering, minimizing disk seeks, and choosing efficient file formats for your use case.
With these skills and best practices, you're well-equipped to handle any file-related tasks in your Python web dev projects. So get out there and start reading and writing!
For further reading, check out Python's official open() docs, File and Directory Access, and Structured File Formats chapters. The RealPython File I/O Guide and Google's Python File I/O Tutorial are also excellent resources to deepen your understanding.
Happy coding, and may your file handling be bug-free and performant!