Python String Format

Introduction

Python string formatting is widely used to insert variables into strings and to control how they are displayed. In practice, however, the printed output often still looks untidy, usually because of poor text alignment or inconsistent spacing.

In this blog post, I describe the general rules of Python string formatting and how to use them to print well-aligned output to the console for machine learning and data science projects.

Basic Python String Format Syntax

Syntax

Although the full Python string format syntax is more complicated, I think the following subset is sufficient for most projects involving scientific computing. The tokens are separated by spaces here only for readability; in an actual format string they are written without spaces.

{id:padding_char alignment sign width comma num_decimals data_type}

Instruction

| Token | Optional | Explanation |
| --- | --- | --- |
| id | Yes | The name or position of the argument that fills the placeholder. If omitted, arguments are used in order. |
| padding_char | Yes | The character used to fill the padding on either side of the value. If no character is given, a space is used. It must be followed by an alignment character. |
| alignment | Yes | ^ is align center; < is align left; > is align right. |
| sign | Yes | If + is used, positive values are prefixed with + and negative values with -. |
| width | Yes | The minimum width of the whole field. If the width is larger than the length of the string to be printed, the remainder is filled with padding_char. |
| comma | Yes | If , is used, large numbers will have commas as thousands separators. |
| num_decimals | Yes | The number of digits after the decimal point for floating-point numbers. It has to be of the form .n, where n is a non-negative integer. |
| data_type | Yes | s is string, f is floating-point number, d is integer. % multiplies the value by 100 and appends a percent sign. |
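
To see what each token contributes on its own, the short sketch below applies them one at a time. The variable names and the sample values are chosen only for illustration.

```python
value = 1234.5  # illustrative sample value

# alignment and width: pad the value to 12 characters
left = "{:<12}".format(value)     # '1234.5      '
right = "{:>12}".format(value)    # '      1234.5'
center = "{:^12}".format(value)   # '   1234.5   '

# padding_char: only valid together with an alignment character
filled = "{:*^12}".format(value)  # '***1234.5***'

# sign: show + in front of positive numbers
signed = "{:+}".format(value)     # '+1234.5'

# comma: thousands separator
grouped = "{:,}".format(1234567)  # '1,234,567'

# num_decimals and data_type
fixed = "{:.2f}".format(value)    # '1234.50'
percent = "{:.2%}".format(0.25)   # '25.00%'

# id: refer to arguments by name instead of by position
named = "{name}: {val:.2f}".format(name="loss", val=0.25)  # 'loss: 0.25'

for result in (left, right, center, filled, signed, grouped, fixed, percent, named):
    print(repr(result))
```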

Example

If we run the following code in Python,

example_line = "|{pi:@^+25,.8f}|".format(pi=314159.26)
print(example_line)

The message printed to the console would be

|@@@@+314,159.26000000@@@@|
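
The width does not have to be hard-coded. A replacement field may contain further `{}` fields inside its format specification, and these are filled in first. The sketch below reproduces the example that way, using a hypothetical `width` variable; the f-string is an equivalent spelling available in Python 3.6 and later.

```python
pi = 314159.26
width = 25  # hypothetical variable holding the field width

# Nested field: {w} is substituted first, giving the spec "@^+25,.8f"
nested = "|{pi:@^+{w},.8f}|".format(pi=pi, w=width)

# The same format specification written inside an f-string
fstring = f"|{pi:@^+{width},.8f}|"

print(nested)   # |@@@@+314,159.26000000@@@@|
print(fstring)  # |@@@@+314,159.26000000@@@@|
```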

Python String Format for Machine Learning and Data Science

We will use the following Python generator to produce fake machine learning training statistics for illustration.

# Generate fake training statistics
def gen_func(n):
    loss_max = 10000.0
    accuracy_max = 1.0
    for i in range(n):
        # epoch, training loss, training accuracy
        yield i, (1-(i+1)/n)*loss_max, (i+1)/n*accuracy_max
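
As a quick sanity check, the sketch below repeats the generator so that it runs on its own and lists what it yields. The value n=4 is an arbitrary choice for illustration.

```python
def gen_func(n):
    loss_max = 10000.0
    accuracy_max = 1.0
    for i in range(n):
        # epoch, training loss, training accuracy
        yield i, (1 - (i + 1) / n) * loss_max, (i + 1) / n * accuracy_max


stats = list(gen_func(n=4))
for epoch, loss, accuracy in stats:
    print(epoch, loss, accuracy)
# 0 7500.0 0.25
# 1 5000.0 0.5
# 2 2500.0 0.75
# 3 0.0 1.0
```

The loss falls linearly from just below loss_max to 0 while the accuracy rises linearly to accuracy_max.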

The following Python code prints the aligned training statistics to the console. Only the variables header_items and width need to be given; the column width is derived from them.

train_op = gen_func(n=10)
header_items = ["Epoch", "Loss", "Accuracy"]
width = 60

dash = "-" * width
# Split the total width evenly across the columns
column_width = width // len(header_items)
column_width_items = [column_width] * len(header_items)
# Interleave header text and column widths: [text, width, text, width, ...]
header_format_content = [None] * (len(header_items) + len(column_width_items))
header_format_content[::2] = header_items
header_format_content[1::2] = column_width_items
# Expand the list into positional arguments using the asterisk
# A {} nested inside a {} is filled in first, here with the column width
header = "{:^{}s}{:^{}s}{:^{}s}".format(*header_format_content)
print(dash)
print(header)
print(dash)
for (epoch, loss, accuracy) in train_op:
    line = "{:^{}d}{:^{}.4f}{:^{}.2%}".format(epoch, column_width, loss, column_width, accuracy, column_width)
    print(line)
print(dash)

The aligned training statistics printed out would be

------------------------------------------------------------
       Epoch                Loss              Accuracy
------------------------------------------------------------
         0               9000.0000             10.00%
         1               8000.0000             20.00%
         2               7000.0000             30.00%
         3               6000.0000             40.00%
         4               5000.0000             50.00%
         5               4000.0000             60.00%
         6               3000.0000             70.00%
         7               2000.0000             80.00%
         8               1000.0000             90.00%
         9                 0.0000             100.00%
------------------------------------------------------------
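
The code above hard-codes three placeholders in each format string. One possible generalization, sketched below, builds the format string from a list of per-column type specifications so that columns can be added without editing the template. The helper names `make_row_format` and `format_table` are hypothetical and not part of the original code.

```python
def make_row_format(type_specs, column_width):
    """Build a format string with one centered field per column.

    type_specs holds the trailing part of each spec, e.g. "d", ".4f", ".2%".
    """
    return "".join("{:^" + str(column_width) + spec + "}" for spec in type_specs)


def format_table(header_items, type_specs, rows, width=60):
    """Return the table as a list of lines (hypothetical helper)."""
    column_width = width // len(header_items)
    dash = "-" * width
    header_format = make_row_format(["s"] * len(header_items), column_width)
    row_format = make_row_format(type_specs, column_width)
    lines = [dash, header_format.format(*header_items), dash]
    lines.extend(row_format.format(*row) for row in rows)
    lines.append(dash)
    return lines


# Two illustrative rows of (epoch, loss, accuracy)
rows = [(0, 9000.0, 0.1), (1, 8000.0, 0.2)]
table = format_table(["Epoch", "Loss", "Accuracy"], ["d", ".4f", ".2%"], rows)
print("\n".join(table))
```

Here the column width is baked into the format string once instead of being passed as a nested `{}` argument on every call, which keeps the per-row call down to `row_format.format(*row)`.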

Author

Lei Mao

Posted on

10-26-2019

Updated on

10-26-2019
