GCC Newspaper
JUNE 15, 2026

Ada: Fix bug when reading multibyte utf-8 character

A bug in the Ada compiler that caused range check failures when reading multibyte UTF-8 characters from a terminal has been fixed.

The get_immediate function, which reads a single character from a terminal, stored each incoming byte in a variable of type char. Every byte of a multibyte UTF-8 sequence has its most significant bit set, so on targets where char is signed these bytes were interpreted as negative values, which resulted in range check failures. The fix changes the variable's type to unsigned char so that no signed conversion takes place.

In Detail

The bug resided in sysdep.c, within the getc_immediate_common function, which uses read() to obtain characters from a terminal. The character was read into a plain char, which is signed on many targets, so any UTF-8 byte with the most significant bit set became a negative value. The fix changes the type to unsigned char. The significance of the change is easiest to see with some knowledge of how the Ada runtime reads character input.
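The effect of the type change can be reproduced in a few lines of C. The sketch below is not the code from sysdep.c: widen_as_signed, widen_as_unsigned and show_byte are hypothetical helpers that mimic holding one input byte in a signed or an unsigned variable and then widening it to int. It assumes a typical target with 8-bit, two's-complement char.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration only; not the code in sysdep.c. */

/* Pre-fix pattern: the byte sits in a signed char before widening to int.
   Assumes an 8-bit, two's-complement char, as on typical targets. */
int widen_as_signed(unsigned char raw)
{
    signed char c = (signed char) raw;
    return c;               /* sign-extended: 0xC3 yields -61 */
}

/* Post-fix pattern: the byte sits in an unsigned char before widening. */
int widen_as_unsigned(unsigned char raw)
{
    unsigned char c = raw;
    return c;               /* zero-extended: 0xC3 yields 195 */
}

/* Print both interpretations of one byte side by side. */
void show_byte(unsigned char raw)
{
    printf("0x%02X: signed %d, unsigned %d\n",
           (unsigned) raw, widen_as_signed(raw), widen_as_unsigned(raw));
}
```

Calling show_byte(0xC3) and show_byte(0xA9), the two bytes that encode "é" in UTF-8, would report negative values in the signed column under the stated assumptions. A value such as -61 lies outside a 0 to 255 character range, which is consistent with the range check failures described in the article. ASCII bytes such as 0x41 give the same result in both columns, which may explain why the problem only surfaced with multibyte input.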

For Context

UTF-8 is a character encoding that represents each character using one to four bytes. Characters outside the ASCII range, which covers most characters used by languages other than English, require multibyte sequences. When such a sequence was read directly from the terminal, its bytes were misinterpreted, leading to errors. This commit fixes the issue by ensuring that the bytes are read as unsigned values, so that UTF-8 terminal input is handled correctly.
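To make the byte layout concrete, here is a minimal sketch of how the first byte of a UTF-8 sequence announces the sequence length. The function name utf8_sequence_length is hypothetical and does not come from the commit. The check is deliberately simplified: it inspects only the leading bit pattern and does not reject overlong or out-of-range encodings.

```c
/* Hypothetical helper for illustration; not part of the GCC sources.
   Returns the number of bytes in the UTF-8 sequence that starts with
   `lead`, or 0 if `lead` is a continuation byte or not a lead pattern. */
int utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80)           return 1;  /* 0xxxxxxx: single-byte ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx: two-byte sequence */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx: three-byte sequence */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx: four-byte sequence */
    return 0;                             /* 10xxxxxx continuation, or other */
}
```

The parameter is an unsigned char on purpose. If a negative signed value reached the first comparison, every byte with the high bit set would compare below 0x80 and be treated as ASCII, which is the same class of mistake the commit removes.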

Filed Under: ada, utf-8, bugfix