Understanding Python Data Types: String and Integer
Variables in Python can store different types of data. The string is a text type, while integers, floats, and complex numbers are numeric types. This article focuses on the string and integer data types and, in particular, on the problem of converting a string to an integer, including methods, error handling, practices, and tips.
First, both data types should be defined. A string is a sequence of characters. Single quotes, double quotes, and triple quotes can all be used in Python code to create strings with varying content. An integer, in turn, is a positive or negative whole number (or zero) with no decimal part.
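The two definitions can be illustrated with a short sketch. The variable names are chosen only for illustration.

```python
# Three ways to write string literals; all of them produce str objects.
single = 'hello'
double = "it's fine"          # double quotes allow an apostrophe inside
triple = """line one
line two"""                   # triple quotes may span several lines

# Integer literals are whole numbers without a decimal point.
positive = 42
negative = -7

print(type(single).__name__, type(triple).__name__)      # str str
print(type(positive).__name__, type(negative).__name__)  # int int
```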
Common Methods for String-to-Integer Conversion in Python
There are several approaches to converting a string to an integer:
- int() function
- eval() function
- literal_eval() function
- usage of specialized libraries
- usage of regular expressions (regex)
int() function
The most popular way to convert a string to an integer in Python is the int() function. It is a fundamental built-in function that converts a given value into an integer. Besides strings representing a numeric value, it also accepts floating-point and integer numbers. The syntax of the int() function is as follows:
- int(x[, base])
x is the value to convert to an integer. base is an optional parameter that denotes the base of the provided number and is used only when x is a string; the default is 10, which corresponds to decimal numbers.
Examples of how to use the int() function are shown in Figure 1; the outputs of the example code are presented in Figure 2.
Figure 1. Example application of int() function
Figure 2. Outputs of example code concerning int() function application
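A minimal sketch of typical int() calls follows. It is not necessarily the code from the figures, and the variable names are illustrative.

```python
decimal_value = int("42")        # plain decimal string -> 42
negative_value = int("-17")      # a leading sign is accepted -> -17
binary_value = int("1010", 2)    # string interpreted in base 2 -> 10
hex_value = int("ff", 16)        # string interpreted in base 16 -> 255
truncated = int(3.99)            # a float argument is truncated toward zero -> 3
unchanged = int(5)               # an int argument is returned as an int -> 5

print(decimal_value, negative_value, binary_value, hex_value, truncated, unchanged)
```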
eval() function
The built-in eval() function evaluates arbitrary Python expressions supplied either as a string or as compiled code. It is chosen when dynamic evaluation of Python expressions is needed. It compiles the string into bytecode and evaluates it as a Python expression, so a string containing an integer literal is effectively converted to an integer. However, the primary disadvantage of eval() is its ability to execute arbitrary code, which can lead to unexpected and potentially harmful outcomes. For this reason, despite its versatility, it is the least safe of the approaches discussed here. It is therefore necessary to validate the input data: its type, format, and limitations.
Examples of how to use the eval() function are shown in Figure 3; the outputs of the example code are presented in Figure 4.
Figure 3. Example application of eval() function
Figure 4. Outputs of example code concerning eval() function application
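The sketch below shows both sides of eval(): it converts a numeric string, but it will just as readily run code unrelated to numbers. The side_effects list and the name log are made up for this illustration; the call that appends to the list stands in for any unwanted action hidden in untrusted input.

```python
value = eval("17")              # the string is evaluated as the literal 17
computed = eval("2 + 3 * 4")    # a whole expression is evaluated -> 14

# The same function executes code that has nothing to do with numbers.
# Here the "malicious" input only appends to a list, but it could do anything.
side_effects = []
eval("log.append('arbitrary code ran')", {"log": side_effects})

print(value, computed, side_effects)
```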
literal_eval() function
The literal_eval() function belongs to the ast module, which helps Python applications process trees of the Python abstract syntax grammar. literal_eval() safely evaluates an expression node or a string that contains only the following literal structures: strings, bytes, numbers, tuples, lists, dictionaries, sets, Booleans, and None. Although it is not the best approach for converting strings to integers, it can be used for that purpose. The ast module must be imported before literal_eval() is used.
The primary benefit of literal_eval() over eval() is safety: it does not execute arbitrary code. Its error behavior also differs from that of int(): depending on the input it may raise, for example, ValueError or SyntaxError, which can help to distinguish malformed syntax from unsupported content.
Examples of how to use the literal_eval() function are shown in Figure 5; the outputs of the example code are presented in Figure 6.
Figure 5. Example application of literal_eval() function
Figure 6. Outputs of example code concerning literal_eval() function application
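A minimal sketch of literal_eval() used for conversion follows. Because the function returns whatever literal the string contains, the hypothetical helper to_int_literal (not part of the ast module) adds a type check so that only integers are accepted.

```python
import ast

number = ast.literal_eval("17")       # int 17
negative = ast.literal_eval("-5")     # int -5
as_float = ast.literal_eval("3.5")    # a float, not an int

def to_int_literal(text):
    """Hypothetical helper: parse text as a literal and accept only ints."""
    value = ast.literal_eval(text)
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"not an integer literal: {text!r}")
    return value

# A function call is not a literal, so it is rejected instead of executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    call_rejected = False
except (ValueError, SyntaxError):
    call_rejected = True

print(number, negative, as_float, call_rejected)
```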
astype() function from the NumPy library
Many Python libraries provide convenient functions to extract and/or convert numbers from strings, especially when they are part of structured data. One example is the astype() method of NumPy arrays. If the data is stored in an array or matrix, astype() returns a copy whose elements are cast to the selected data type (for integers, the dtype parameter can be set to 'i'). A possible weakness is that the result is still an array or matrix, so additional steps are needed to extract individual elements.
Examples of how to use the astype() function are shown in Figure 7; the outputs of the example code are presented in Figure 8.
Figure 7. Example application of astype() function
Figure 8. Outputs of example code concerning astype() function application
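A minimal sketch, assuming NumPy is installed. It passes the built-in int as the target type rather than 'i'; the exact integer width of the result depends on the platform and NumPy version, so the sketch does not rely on it.

```python
import numpy as np

text_values = np.array(["1", "20", "300"])   # array of strings
int_values = text_values.astype(int)         # new array; the original is unchanged

# The result is still an array, so a single element is extracted by indexing.
first = int(int_values[0])

print(int_values.tolist(), first)
```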
Regular expressions (regex)
To parse numbers from complex strings that mix numeric and non-numeric characters, regular expressions (the Python re module) are a natural first choice. This approach searches the string for digits or numbers using a number pattern and then extracts them for conversion to integers. The main challenge is building an appropriate pattern, which can depend strongly on the input data. Note also that the extracted values are still strings, so a function such as int() is needed to convert them.
Examples of how to use a regular expression are shown in Figure 9; the outputs of the example code are presented in Figure 10.
Figure 9. Example application of a regular expression
Figure 10. Outputs of example code concerning regular expression application
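A minimal sketch of the extract-then-convert idea. The sample text and the pattern are assumptions made for illustration: the pattern matches an optional minus sign followed by digits, which may be too permissive for input that contains hyphens.

```python
import re

text = "Order 66 shipped 3 boxes, balance -12 units"

# findall returns the matched substrings, which are still strings.
matches = re.findall(r"-?\d+", text)

# int() performs the actual conversion.
numbers = [int(m) for m in matches]

print(matches)   # ['66', '3', '-12']
print(numbers)   # [66, 3, -12]
```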
Tips when Converting Strings to Integers
There are a few typical tips to keep in mind when converting strings to integers. First, the string must correspond exactly to an integer value: any non-numeric character raises an error, for example int('17a'). Leading and trailing whitespace is an exception (e.g. int(' 17 ')): int() ignores it. The other functions discussed above do not necessarily treat whitespace the same way, so this should be checked for the function in use. Second, if a numeric string starts with zeros (e.g. '00017'), int() drops them during conversion (the output is 17). eval() and literal_eval(), by contrast, reject such a string, because Python does not allow leading zeros in a non-zero decimal integer literal. Third, a common mistake is to confuse floats with integers, although they are two different data types. For example, passing a string that represents a float, such as '3.7', to int() raises an error. In that case the string should first be converted to a floating-point number and only then to an integer.
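The tips above can be checked with a short sketch. The helper raises_value_error is hypothetical and exists only to record whether int() rejects a given string.

```python
def raises_value_error(text):
    """Hypothetical helper: True if int(text) raises ValueError."""
    try:
        int(text)
    except ValueError:
        return True
    return False

padded = int("  17  ")                      # surrounding whitespace is ignored -> 17
zeros = int("00017")                        # leading zeros are dropped -> 17
bad_suffix = raises_value_error("17a")      # non-numeric character -> True
float_string = raises_value_error("3.7")    # float-like string -> True
via_float = int(float("3.7"))               # convert via float; truncates to 3

print(padded, zeros, bad_suffix, float_string, via_float)
```

Note that int() truncates rather than rounds, so the two-step conversion of '3.7' gives 3, not 4.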
Error Handling During Conversion
Finally, two ways to handle errors during conversion should be discussed.
isdigit() function
The first way is to use the string method isdigit() to check whether a string consists only of digit characters. If it returns True for an ordinary decimal string, int() can then convert the string without raising an error. A space, or a leading sign, in the string makes isdigit() return False. An example of isdigit() usage together with its output is shown in Figure 11.
Figure 11. Example application of isdigit() function including output
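A minimal sketch of validating with isdigit() before converting. The helper parse_if_digits is hypothetical, and returning None for rejected input is just one possible convention.

```python
samples = ["17", " 17", "-17", "3.7", ""]

# Only a non-empty string made up entirely of digits passes the check.
checks = {s: s.isdigit() for s in samples}

def parse_if_digits(text):
    """Hypothetical helper: convert only when isdigit() accepts the string."""
    return int(text) if text.isdigit() else None

print(checks)
print(parse_if_digits("17"), parse_if_digits("-17"))
```

One caveat: some Unicode characters, such as the superscript '²', pass isdigit() yet are rejected by int(), so this check is reliable mainly for plain ASCII input.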
try-except block
Another way is to wrap the conversion in a try-except block that catches the errors raised when an unsuitable string is converted to an integer. Compared with the first option, this approach is more universal: it relies on the exceptions raised by the conversion function itself rather than on a separate check of the input string. An example of a try-except block together with its output is shown in Figure 12.
Figure 12. Example application of try-except block including output
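A minimal sketch of the try-except approach. The helper safe_int and its default parameter are hypothetical; a real application might prefer to log or re-raise the error instead of returning a fallback value.

```python
def safe_int(text, default=None):
    """Hypothetical helper: return int(text), or default if conversion fails."""
    try:
        return int(text)
    except (ValueError, TypeError):
        # ValueError: the string is not a valid integer, e.g. "17a".
        # TypeError: the argument is not a string or number, e.g. None.
        return default

print(safe_int("17"))        # 17
print(safe_int(" -4 "))      # -4
print(safe_int("17a"))       # None
print(safe_int(None, 0))     # 0
```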
FAQ
How do I convert a string to an integer in Python?
There are several approaches to convert a string to an integer in Python.
The basic one is the int() function, which converts a given value into an integer. Given a numeric string, it returns the corresponding integer value.
The second option is the eval() function, which evaluates arbitrary Python expressions. It evaluates Python expressions dynamically and, as a by-product, converts strings to integers. However, because eval() can execute arbitrary code, it may lead to unexpected results, so it requires special care.
The third approach is the literal_eval() function from the ast module. It safely evaluates expression nodes or strings containing only specific data types, such as strings, numbers, tuples, lists, dictionaries, sets, Booleans, and None. Although literal_eval() was not created for string-to-integer conversion, it can be used for it.
Various Python libraries can also help to convert strings to integers. One example is the astype() method from NumPy, which is especially useful for structured data.
The last popular option is regular expressions, which can parse numbers out of complex strings consisting of numeric and non-numeric characters. However, building an appropriate pattern can be a challenge in itself.
What are common mistakes to avoid in string-to-int conversion?
There are typical mistakes to avoid when converting strings to integers in Python; avoiding them helps to ensure accurate and error-free conversion.
First, exact correspondence must be guaranteed: the string must represent an integer value precisely. Any non-numeric character can cause an error.
Second, special attention should be paid to zeros at the beginning of a numeric string. With int(), these zeros are removed automatically during conversion. With other conversion options, however, unexpected results, including errors, can arise.
Third, the difference between the float and integer data types should be kept in mind. For example, an attempt to convert a string that represents a float to an integer using int() will cause an error. Such strings should therefore be processed separately: first converted to a floating-point number and only then to an integer.
Fourth, all input data should be validated before the conversion to ensure that it matches the expected format and constraints. Validation consists of checking the data type, the format, and any limitations imposed by the selected conversion approach.
Additionally, special attention should be paid to the eval() function. Although it evaluates Python expressions dynamically, its ability to execute arbitrary code carries risks. It is therefore necessary to verify the input data to minimize them.
Only following best practices and avoiding the mistakes described above can ensure accurate and reliable conversion of strings to integers in Python.