bug#33403: [Geiser-users] Data length limit in Guile/Geiser/Scheme evaluation
Neil Jerram
2018-11-15 22:33:42 UTC
Hi, this is a report for Guile 2.2:

***@henry:~$ guile --version
guile (GNU Guile) 2.2.3
Packaged by Debian (2.2.3-deb+1-3ubuntu0.1)

I'm seeing something that looks like a line or sexp length limit when
reading from a terminal. Sample inputs are in the attached file.
Mark H Weaver
2018-11-16 06:49:39 UTC
Hi Neil,
Post by Neil Jerram
guile (GNU Guile) 2.2.3
Packaged by Debian (2.2.3-deb+1-3ubuntu0.1)
I'm seeing something that looks like a line or sexp length limit when
reading from a terminal. Sample inputs are in the attached file.
If I run guile in a terminal (GNOME Terminal 3.28.2), select the first
block from the file, and use my middle mouse button to paste it into the
$4 = 139
If I do the same with the second block, I get no response, and it
appears that Guile has hung in some way. I have to type C-c to get a
User interrupt
The max line length for the first block is 4087. For the second it's
4113. Could there be a 4K buffer or limit involved somewhere?
Indeed, I can reproduce the same issue when pasting into an Emacs shell
buffer. I've verified that Guile only receives the first 4095 bytes of
the first line. The following characters from the end of the first line
are lost:

A AAAA" "Aub")))))

So the second and third lines of the input become part of the string
literal whose closing quote was lost, and Guile's reader continues to
wait for a closing quote.

If, after pasting this, you type another close quote, 5 close parens,
and then repaste the last two lines, it will print the garbled input and
return to a prompt.

Anyway, to make a long story short, after some debugging, I found that
precisely the same truncation of the first line happens when using 'cat'
from GNU coreutils. Simply type 'cat' and paste the same text, and
you'll see that in the output, only the first 4095 bytes of the first
line were retained.

So, I'm not sure where the problem is, but it's not a problem in Guile.

Regards,
Mark
Mark H Weaver
2018-11-16 07:13:40 UTC
Post by Mark H Weaver
If, after pasting this, you type another close quote, 5 close parens,
and then repaste the last two lines, it will print the garbled input and
return to a prompt.
Actually, instead of pasting the last two lines as-is, I replaced
"(length classification)" with "classification", so that instead of
printing the length, it prints the actual s-exp. Then you can see what
happened to that final string literal.
Post by Mark H Weaver
Anyway, to make a long story short, after some debugging, I found that
precisely the same truncation of the first line happens when using 'cat'
from GNU coreutils. Simply type 'cat' and paste the same text, and
you'll see that in the output, only the first 4095 bytes of the first
line were retained.
So, I'm not sure where the problem is, but it's not a problem in Guile.
This is a documented limitation in Linux's terminal handling when in
canonical mode. See the termios(3) man page, which includes this text:

Canonical and noncanonical mode

The setting of the ICANON canon flag in c_lflag determines
whether the terminal is operating in canonical mode (ICANON set)
or noncanonical mode (ICANON unset). By default, ICANON is set.

In canonical mode:

* Input is made available line by line. An input line is
available when one of the line delimiters is typed (NL, EOL,
EOL2; or EOF at the start of line). Except in the case of EOF,
the line delimiter is included in the buffer returned by
read(2).

* Line editing is enabled (ERASE, KILL; and if the IEXTEN flag is
set: WERASE, REPRINT, LNEXT). A read(2) returns at most one
line of input; if the read(2) requested fewer bytes than are
available in the current line of input, then only as many bytes
as requested are read, and the remaining characters will be
available for a future read(2).

* The maximum line length is 4096 chars (including the
terminating newline character); lines longer than 4096 chars
are truncated. After 4095 characters, input processing (e.g.,
ISIG and ECHO* processing) continues, but any input data after
4095 characters up to (but not including) any terminating
newline is discarded. This ensures that the terminal can
always receive more input until at least one line can be read.

Note that last item above.
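
As a rough illustration of that last item, here is a small Python model of the truncation rule. It is only a sketch of the behaviour the man page describes, not kernel code: the names MAX_CANON and canonical_line are made up for this example, and the 4113-byte sample line is assumed to count its newline.

```python
# Model of the canonical-mode rule quoted above; it never touches a real terminal.
MAX_CANON = 4096  # maximum line length, including the terminating newline


def canonical_line(raw: bytes) -> bytes:
    """Return what a reader would receive for one newline-terminated input line."""
    body, newline, _ = raw.partition(b"\n")
    # Only the first 4095 bytes before the newline survive; the newline is kept.
    return body[:MAX_CANON - 1] + newline


short_line = b"A" * 4086 + b"\n"   # 4087 bytes: under the limit, passes through intact
long_line = b"A" * 4112 + b"\n"    # 4113 bytes: over the limit, tail is discarded

print(len(canonical_line(short_line)), len(canonical_line(long_line)))
```

Under this model the 4087-byte line is delivered unchanged, while the 4113-byte line loses everything between byte 4095 and its newline, which matches the lost closing quote and parens described earlier.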

Mark
Neil Jerram
2018-11-16 10:44:57 UTC
Post by Mark H Weaver
This is a documented limitation in Linux's terminal handling when in
Canonical and noncanonical mode
The setting of the ICANON canon flag in c_lflag determines
whether the terminal is operating in canonical mode (ICANON set)
or noncanonical mode (ICANON unset). By default, ICANON is set.
[...]
Post by Mark H Weaver
* The maximum line length is 4096 chars (including the
terminating newline character); lines longer than 4096 chars
are truncated. After 4095 characters, input processing (e.g.,
ISIG and ECHO* processing) continues, but any input data after
4095 characters up to (but not including) any terminating
newline is discarded. This ensures that the terminal can
always receive more input until at least one line can be read.
Note that last item above.
Awesome; thank you Mark.

So possibly this limit can be removed, in my Org/Geiser context, by
evaluating (system* "stty" "-icanon") when initializing the Geiser-Guile
connection. I'll try that. Will the terminal that that 'stty' sees be
the same as Guile's stdin?

Jao, if that works, I wonder if it should be the default for Geiser? It
appears to me that Geiser shouldn't ever need the features of canonical
mode. Is that right?

Anyway, I'll see first if the stty call is effective.

Neil
Neil Jerram
2018-11-16 11:16:56 UTC
Post by Neil Jerram
Awesome; thank you Mark.
So possibly this limit can be removed, in my Org/Geiser context, by
evaluating (system* "stty" "-icanon") when initializing the Geiser-Guile
connection. I'll try that. Will the terminal that that 'stty' sees be
the same as Guile's stdin?
Jao, if that works, I wonder if it should be the default for Geiser? It
appears to me that Geiser shouldn't ever need the features of canonical
mode. Is that right?
Anyway, I'll see first if the stty call is effective.
Yes, with this in my ~/.guile-geiser -

(system* "stty" "-icanon")

- I can do evaluations past the 4K line length limit, and the Org-driven
problem that I first reported [1] has disappeared.

Thanks to Nicolas, Jao and Mark for your help in understanding this.

Neil

[1] https://lists.gnu.org/archive/html/emacs-orgmode/2018-11/msg00177.html
Jose A. Ortega Ruiz
2018-11-16 23:12:32 UTC
Post by Neil Jerram
Awesome; thank you Mark.
So possibly this limit can be removed, in my Org/Geiser context, by
evaluating (system* "stty" "-icanon") when initializing the Geiser-Guile
connection. I'll try that. Will the terminal that that 'stty' sees be
the same as Guile's stdin?
Jao, if that works, I wonder if it should be the default for Geiser? It
appears to me that Geiser shouldn't ever need the features of canonical
mode. Is that right?
Anyway, I'll see first if the stty call is effective.
Yes, with this in my ~/.guile-geiser -
(system* "stty" "-icanon")
- I can do evaluations past the 4K line length limit, and the Org-driven
problem that I first reported [1] has disappeared.
Ah, system* is a Scheme call! So yeah, maybe we could add that call to
Geiser's Guile initialization... I don't really see how that would cause
any problem elsewhere.
Post by Neil Jerram
Thanks to Nicolas, Jao and Mark for your help in understanding this.
And thanks to Nicolas, Mark and you for yours :)

Cheers,
jao
--
The vast majority of human beings dislike and even dread all notions with
which they are not familiar. Hence it comes about that at their first
appearance innovators have always been derided as fools and madmen.
-Aldous Huxley, novelist (1894-1963)
Jose A. Ortega Ruiz
2018-11-16 22:40:13 UTC
Post by Neil Jerram
Awesome; thank you Mark.
So possibly this limit can be removed, in my Org/Geiser context, by
evaluating (system* "stty" "-icanon") when initializing the Geiser-Guile
connection. I'll try that. Will the terminal that that 'stty' sees be
the same as Guile's stdin?
Jao, if that works, I wonder if it should be the default for Geiser? It
appears to me that Geiser shouldn't ever need the features of canonical
mode. Is that right?
I don't really know offhand. Geiser simply uses comint-mode to talk to
Guile, and that in turn must be using Emacs' ability to spawn a process
and redirect its stdout and stderr, so I am not sure where the stty
kernel side enters the game, and how exactly that call to system* should
be performed to make sure it only affects the Guile-Emacs
communications.

Geiser has a mode of operation whereby it connects to a running Guile
REPL server instead of spawning its own process. In that mode, instead
of a stdout/err redirection, what is used is a TCP/IP connection, which
won't have any of these limitations. So a cleaner solution would be to
make Geiser always use a REPL server for Guile, but that requires some
non-trivial work on my side. Another option would be for the Org mode
package to set up a Guile server and then use connect-to-guile (instead
of run-guile), but I don't know how difficult that would be.
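
A small Python sketch of why a socket connection sidesteps the limit: a stream socket has no terminal line discipline, so a line longer than 4096 bytes arrives whole. This uses a local socket pair as a stand-in (an assumption, to keep the demo self-contained); it does not talk to a real Guile REPL server, and recv_line is a hypothetical helper.

```python
import socket
import threading


def recv_line(sock: socket.socket) -> bytes:
    """Read from sock until a chunk ends with a newline, and return everything read."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:                 # peer closed the connection
            break
        chunks.append(chunk)
        if chunk.endswith(b"\n"):     # the demo sends exactly one newline, at the end
            break
    return b"".join(chunks)


left, right = socket.socketpair()
line = b'(string-length "' + b"A" * 5000 + b'")\n'   # well past the 4096-byte tty limit

# Send from a separate thread so a small socket buffer cannot stall the demo.
sender = threading.Thread(target=left.sendall, args=(line,))
sender.start()
received = recv_line(right)
sender.join()
left.close()
right.close()

print("sent", len(line), "bytes; received intact:", received == line)
```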

Finally, a shabby workaround would be generating multiple lines instead
of a big one :) That's of course not a real solution, but maybe can
work as a stopgap.
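
A rough Python sketch of that stopgap, under some loud assumptions: the function name wrap_sexp and the 4000-byte default are made up, and the scanner only understands double-quoted strings with backslash escapes (it ignores Scheme comments and character literals such as #\"). It turns spaces between tokens into newlines, which the Scheme reader treats the same way, and never breaks inside a string literal.

```python
def wrap_sexp(text: str, limit: int = 4000) -> str:
    """Break long lines at spaces outside string literals, keeping lines <= limit bytes
    whenever a single token does not itself exceed the limit."""
    out = []             # output characters
    line_bytes = 0       # bytes on the current output line
    break_at = None      # index in `out` of the last space outside a string
    bytes_after = 0      # bytes emitted since that space
    in_string = False
    escaped = False
    for ch in text:
        size = len(ch.encode("utf-8"))
        out.append(ch)
        if ch == "\n":
            line_bytes, break_at, bytes_after = 0, None, 0
        else:
            line_bytes += size
            bytes_after += size
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == " ":
            break_at, bytes_after = len(out) - 1, 0
        if line_bytes > limit and break_at is not None:
            out[break_at] = "\n"                  # turn the last safe space into a break
            line_bytes, break_at = bytes_after, None
    return "".join(out)


items = " ".join('"item %d"' % i for i in range(1000))
wrapped = wrap_sexp("(list " + items + ")", limit=100)
print(max(len(l.encode("utf-8")) for l in wrapped.split("\n")))
```

A single string literal longer than the limit still cannot be split this way, so this does not help when one literal alone exceeds 4095 bytes.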
Post by Neil Jerram
Anyway, I'll see first if the stty call is effective.
Excellent. Thanks for taking the time and please keep us posted!

Cheers,
jao
--
"I don't want to achieve immortality through my work... I want to
achieve it through not dying" -- Woody Allen