In Python, alphabetizing strings is case-sensitive. (That's actually true for most programming languages.) And while that may seem obvious to those who have been programming for a while, it's not exactly intuitive.
Case-sensitive ordering
When sorting, uppercase letters come "alphabetically" before lowercase letters. In the following sorted list, "DOG" ends up first while "dog" ends up last:
my_list = ["dog", "DoG", "DOG", "Dog"]
sorted(my_list) # ['DOG', 'DoG', 'Dog', 'dog']
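This is because Python orders strings by the numeric code (the Unicode code point) of each character. For English letters, those codes are the same as the ones in the ASCII table: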
![](https://streamofcoding.com/content/images/2021/08/00_ascii_table.png)
The ordinal values 65 through 90 represent the uppercase letters, and 97 through 122 represent the lowercase ones. Python's built-in ord function returns the ordinal value (the Unicode code point, which matches ASCII for these characters) of any single character:
ord("D") # 68
ord("d") # 100
When sorting strings, Python compares them character by character using these ordinal values, and the first position where two strings differ decides their order. Since the uppercase letters have lower ordinal values, they come first.
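To make that comparison visible, here is a small sketch that lists the ordinal values behind each string. The helper name code_points is made up for this illustration and is not part of Python; sorting with it as the key should give the same order as the default sort.

```python
def code_points(text):
    """Return the ordinal value of every character in text (hypothetical helper)."""
    return [ord(ch) for ch in text]

words = ["dog", "DoG", "DOG", "Dog"]

# Lists of integers compare element by element, just like strings do,
# so sorting by the explicit ordinals matches the default string sort.
by_ordinals = sorted(words, key=code_points)
default = sorted(words)

for word in default:
    print(word, code_points(word))
# DOG [68, 79, 71]
# DoG [68, 111, 71]
# Dog [68, 111, 103]
# dog [100, 111, 103]
```

Reading down the columns shows why "DOG" wins: all four strings tie on nothing but the first letter of the first three, and 79 ("O") is smaller than 111 ("o").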
Case-insensitive ordering
But sometimes you need case-insensitive alphabetical ordering. Then what? In this case, pass a custom sort key to ignore the case.
The traditional way to do this, which works in both Python 2 and Python 3, is to use str.lower or str.upper as the key:
sorted(my_list, key=str.lower) # ['dog', 'DoG', 'DOG', 'Dog']
In Python 3 (version 3.3 and later), use str.casefold instead:
sorted(my_list, key=str.casefold) # ['dog', 'DoG', 'DOG', 'Dog']
str.casefold is better than str.lower because it applies Unicode case folding, a more aggressive form of lowercasing that is meant for caseless comparison and properly handles non-English characters.
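The German letter "ß" is a common way to see the difference. It is already lowercase, so str.lower leaves it alone, while str.casefold turns it into "ss". The word list below is just an illustrative sample:

```python
words = ["STRASSE", "straße", "Strasse"]

print("straße".lower())     # straße
print("straße".casefold())  # strasse

# Count how many distinct keys each method produces for the same words.
lowered = {word.lower() for word in words}
folded = {word.casefold() for word in words}

print(len(lowered))  # 2 (str.lower keeps "straße" apart from "strasse")
print(len(folded))   # 1 (str.casefold treats all three as the same word)
```

With str.lower as the sort key, "straße" would be ordered separately from the other two spellings. With str.casefold, all three are treated as equal.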
(Note: Python sorts are stable, so the final ordering of these lists will depend on the initial order of the elements. Since "DOG" and "dog" are alphabetically the same when case is ignored, the string that came first in the unsorted input will come first in the sorted output too.)
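Here is a quick sketch of that stability in action, plus one possible way to get the same result regardless of input order: sort by a tuple of the casefolded string and the original string. The helper name casefold_then_exact is invented for this example, and using the exact string as a tie-breaker is just one reasonable choice.

```python
a = ["dog", "DoG", "DOG", "Dog"]
b = ["Dog", "DOG", "DoG", "dog"]

# All four strings share the key "dog", so each list keeps its input order.
print(sorted(a, key=str.casefold))  # ['dog', 'DoG', 'DOG', 'Dog']
print(sorted(b, key=str.casefold))  # ['Dog', 'DOG', 'DoG', 'dog']

def casefold_then_exact(text):
    """Sort key (hypothetical): ignore case first, then break ties case-sensitively."""
    return (text.casefold(), text)

# The tie-breaker makes the output independent of the input order.
print(sorted(a, key=casefold_then_exact))  # ['DOG', 'DoG', 'Dog', 'dog']
print(sorted(b, key=casefold_then_exact))  # ['DOG', 'DoG', 'Dog', 'dog']
```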