
      1 <title>ANS Forth RFI 0006: Writing to Input Buffers</title>
      2 
      3 This document is produced by TC X3J14 as its clarification of questions
      4 raised about ANSI X3.215-1994, American National Standard for Information
      5 Systems - Programming Languages - Forth.
      6 
      7 <p>
      8 The questions covered herein were raised by query
      9 
     10 <p>
     11 Q0006, regarding Writing to Input Buffers.
     12 
     13 <p>There are four parts in this document:
     14 
     15 <ol>
     16 <li><a href="#query">The original question as received.</a>
     17 <li><a href="#reply">The TC's reply.</a>
     18 <li><a href="#ballot">The Letter Ballot issued by TC Chair.</a>
     19 <li><a href="#results">TC Chair's statement of ballot results.</a>
     20 </ol>
     21 
     22 <h2><a name="query">Q0006 as received</a></h2>
     23 
     24 <pre>
     25 Subject: Q0006 Request Recognized
     26 To: X3J14@minerva.com
     27 Date: Thu, 05 Oct 95 09:23:38 PDT  
     28 </pre>
     29 
     30 The following query has been assigned number Q0006.
     31    - Greg Bailey, by direction  950913 0159Z
     32 <hr>
     33 
     34 Request from TC member Jonah Thomas (Jet.Thomas@minerva.com)
     35 
     36 <p>
     37 Here's a question about the Standard:
     38 
     39 <p>
     40 When is it allowed to write into input buffers?
     41 
     42 <p>
     43 On the surface this is simple:  "A program shall not write into the
     44 input buffer."
     45 
     46 <p>
     47 But what about this:
     48 
     49 <pre>
     50 {
     51 <a href=dpans6.htm#6.1.0450>:</a> CHANGE-TEXT <a href=dpans6.htm#6.1.2165>S"</a>  <a href=dpans15.htm#15.6.1.2465>WORDS</a> <a href=dpans15.htm#15.6.1.0220>.S</a> " <a href=dpans6.htm#6.1.2165>S"</a> <a href=dpans6.htm#6.1.2160>ROT</a> <a href=dpans6.htm#6.1.2260>SWAP</a> MOVE" <a href=dpans6.htm#6.1.1360>EVALUATE</a> <a href=dpans6.htm#6.1.0460>;</a>
     52 
     53  <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans6.htm#6.1.2216>SOURCE</a> <a href=dpans6.htm#6.1.1260>DROP</a> CHANGE-TEXT <a href=dpans15.htm#15.6.1.0220>.S</a>
     54 }
     55 </pre>
     56 
     57 Our input buffer starts out with
     58 
     59 <pre>
     60  <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans6.htm#6.1.2216>SOURCE</a> <a href=dpans6.htm#6.1.1260>DROP</a> CHANGE-TEXT <a href=dpans15.htm#15.6.1.0220>.S</a>
     61 </pre>
     62 
     63 When it executes CHANGE-TEXT it has ( ca ) the buffer address on the
     64 stack.  CHANGE-TEXT first provides a string ( ca ca' u ) then another
     65 string ( ca ca' u ca" u") and then evaluates the second string.  The
     66 original buffer is now not the input buffer, now the 2nd string is
     67 the read-only input buffer.  It does <a href=dpans6.htm#6.1.2160>ROT</a> ( ca' u ca ) 
     68 <a href=dpans6.htm#6.1.2260>SWAP</a> ( ca' ca u)
     69 <a href=dpans6.htm#6.1.1900>MOVE</a> and moves the first string into 
     70 the original input buffer, which now
     71 reads
     72 
     73 <pre>
     74  <a href=dpans15.htm#15.6.1.2465>WORDS</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans15.htm#15.6.1.0220>.S</a> <a href=dpans6.htm#6.1.2216>SOURCE</a> <a href=dpans6.htm#6.1.1260>DROP</a> CHANGE-TEXT <a href=dpans15.htm#15.6.1.0220>.S</a>
     75 </pre>
     76 
     77 Then it reverts to the changed input buffer and does the final 
     78 <a href=dpans15.htm#15.6.1.0220>.S</a> .
     79 
     80 <p>
     81 I say that even if this is legal, it's bad practice.  If you want to 
     82 alter an input buffer, better to copy it into your own private area and
     83 change it however you like and then do <a href=dpans6.htm#6.1.1360>EVALUATE</a> .
     84 
     85 <p>
     86 My question is whether this _is_ legal, and why or why not.
     87 
     88 <p>
     89 I originally interpreted a previous dpANS document as saying that the
     90 user input buffer, the one pointed to by the obsolescent 
     91 <a href=dpans6.htm#6.2.2290>TIB</a> and <a href=dpans6.htm#6.2.0060>#TIB</a> ,
     92 is read-only.  That makes a different kind of sense to me.  In some
     93 systems it could be a read-only pipe from some other process, or it could
     94 be some special hardware read-only buffer.  It makes sense to me that the
     95 user input buffer should never be modified by a standard program.
     96 
     97 <p>
     98 So I can see 4 obvious choices, with perhaps others also available:
     99 
    100 <ol>
    101 
    102 <li>The current input buffer is read-only; strings which are not the
    103 current input buffer may be written to unless they're read-only for some
    104 other reason.
    105 
    106 <li>The current input buffer is read-only as is the user input buffer;
    107 any region of allotted memory is read/write except while it is the input
    108 buffer.
    109 
    110 <li>Any region of memory that is EVALUATEd is forever after read-only.
    111 
    112 <li>Any region of memory that is EVALUATEd is read-only until the
    113 evaluation is complete -- either by evaluation until the parse area is
    114 empty and return from <a href=dpans6.htm#6.1.1360>EVALUATE</a> or 
    115 <a href=dpans6.htm#6.1.1380>EXIT</a> found in the evaluated string, or
    116 <a href=dpans6.htm#6.1.2050>QUIT</a> or 
    117 <a href=dpans6.htm#6.1.0670>ABORT</a> performed before evaluation is 
    118 finished, or if an ambiguous
    119 condition exists due to the code evaluated.
    120 
    121 </ol>
    122 
    123 I don't like #3, since I sometimes want to evaluate text that's in block
    124 buffers or file buffers.  #4 could take a lot of figuring to tell whether
    125 it's being violated or not.  I like #1, but it seems to allow some
    126 atrocious practices in standard code.  #2 has the further problem that
    127 there is no way for a program to tell whether an address is in the 
    128 read-only user input buffer, except with 
    129 <a href=dpans6.htm#6.2.2218>SOURCE-ID</a> or 
    130 <a href=dpans6.htm#6.2.2290>TIB</a> , both of which 
    131 are in an optional extension wordset.
    132 
    133 <p>
    134 Any ideas?
    135 
    136 <h2><a name="reply">TC Reply to Q0006</a></h2>
    137 
    138 <pre>
    139 From: Elizabeth Rather
    140 Subject: Q0006R, Writing to Input Buffers, Official Response
    141 To: X3J14 Technical Committee
    142 Cc: lbarra@itic.nw.dc.us
    143 Date: Mon, 19 Feb 96 14:09
    144 
    145 
    146 Doc#:  X3J14/Q0006R
    147 Reference Doc#:  X3.215.1994 ANS Forth
    148 Date:  February 19, 1996
    149 Title: Response to Request for Interpretation Q0006, Writing to Input Buffers
    150 </pre>
    151 
    152 <h3>Q0006:  When is it allowed to write into input buffers?</h3>
    153 
    154 Request from TC member Jonah Thomas (Jet.Thomas@minerva.com)
    155 
    156 <i><blockquote>
    157 Here's a question about the Standard:
    158 
    159 When is it allowed to write into input buffers?
    160 </blockquote></i>
    161 
    162 When a standard program receives the address and length of the
    163 'input buffer' from the word  <a href=dpans6.htm#6.1.2216>SOURCE</a>  
    164 it *must* treat this information as though it describes a read-only region of memory.
    165 
    166 <p>
    167 This is one of the conditions that must be met in order to compose
    168 code whose processing of 'input buffers' is independent of the
    169 physical and logical characteristics of the diverse 'input sources'
    170 described in the Standard.
    171 
    172 <p>
    173 It is logically possible for an application to alter the 'input
    174 buffer' supplied to  <a href=dpans6.htm#6.1.1360>EVALUATE</a>  
    175 while that 'input buffer' is being
    176 processed.  This possibility exists as a special case because the
    177 application actually "owns" and has full control over the buffer
    178 in question.  As owner of the buffer, and both producer and 
    179 consumer of the data it contains, the application can alter this
    180 buffer and manipulate  <a href=dpans6.htm#6.1.0560>>IN</a>  with 
    181 deterministic results, assuming
    182 that it knows the buffer to exist in physically writable memory.
    183 
    184 <p>
    185 The Standard does not describe this possibility because the
    186 implied coding methods constitute a special case which is only
    187 usable with  <a href=dpans6.htm#6.1.1360>EVALUATE</a>  and can 
    188 lead to the development of source
    189 language syntax which is not processable from other 'input sources.'
    190 
    191 <h3>Relevant text from the Standard:</h3>
    192 
    193 <dl>
    194 
    195 <dt>
    196 <a href=dpans6.htm#6.1.1360>6.1.1360</a>  <a href=dpans6.htm#6.1.1360>EVALUATE</a>
    197 <dd>
    198     ... Make the string described ... both the
    199         'input source' and 'input buffer' ...
    200 
    201 <dt>
    202 <a href=dpans6.htm#6.1.2216>6.1.2216</a>  <a href=dpans6.htm#6.1.2216>SOURCE</a>
    203 <dd>
    204     ... c-addr is the address of, and u is the number of
    205     characters in, the 'input buffer'.
    206 
    207 <dt>
    208 <a href=dpans2.htm#2.1>2.1</a> Definitions of Terms
    209 <dd>
    210  <dl>
    211     <dt>input source: <dd> The device, file, block, or other entity
    212         that supplies characters to refill the 'input buffer'.
    213     <dt>input buffer: <dd> A region of memory containing the sequence
    214         of characters from the input source that is currently
    215         accessible to a program.
    216  </dl>
    217 
    218 <dt>
    219 <a href=dpans3.htm#3.3.3.5>3.3.3.5</a>  Input buffers
    220 <dd>
    221     The address, length, and content of the 'input buffer' may be
    222         transient.  A program shall not write into the 'input
    223         buffer'.  ... the 'input buffer' is either ... or a buffer
    224         specified by  <a href=dpans6.htm#6.1.1360>EVALUATE</a> .  ...  An ambiguous condition
    225         exists if a program modifies the contents of the 'input
    226         buffer'.
    227 
    228 </dl>
    229 
    230 <h3>Discussion of Technical Committee Intent</h3>
    231 
    232 <a href=dpans6.htm#6.2.2040>QUERY</a> and <a href=dpans6.htm#6.2.2290>TIB</a> 
    233 have been deprecated because while some systems have
    234 a discrete *place* called <a href=dpans6.htm#6.2.2290>TIB</a> that is 
    235 exclusively used for storing
    236 the results of <a href=dpans6.htm#6.2.2040>QUERY</a>, other existing systems 
    237 actually use <a href=dpans6.htm#6.2.2290>TIB</a> as a
    238 handle for the current line being interpreted from a file.  While
    239 applications exist making either assumption, these applications can
    240 obviously not work properly on both sorts of systems in general.
    241 
    242 <p>Storing into 'input buffers' is disallowed because we permit input
    243 sources to nest indefinitely and it is not practical for systems
    244 that conserve resources to guarantee unique concurrent addressability
    245 of all nested input sources, nor is it practical to create separate
    246 save areas for all current input buffers just in case someone stored
    247 into one of them.  The TC specifically intends that, when input is
    248 coming from refreshable sources, implementations may refresh their
    249 buffers on un-nesting to conserve resources, and that when logically
    250 possible implementations may use transient, shared buffers (as is
    251 common practice with  <a href=dpans7.htm#7.6.1.1790>LOAD</a>  on 
    252 multiprogrammed systems.)  Therefore,
    253 the result of storing into input buffers is stated as ambiguous,
    254 and may even be physically disallowed, as in the case of interpreting
    255 source from read only memory mapped files in some operating systems.
    256 
    257 <p>
    258 For similar reasons the address returned by  
    259 <a href=dpans6.htm#6.1.2216>SOURCE</a>  is transient,
    260 and there is specifically no guarantee that any 'input buffer' other
    261 than *the* (current) 'input buffer' is addressable at any time, nor
    262 that this address be valid after nesting or un-nesting.  (Indeed,
    263 in classical multiprogrammed systems the address returned by
    264 <a href=dpans6.htm#6.1.2216>SOURCE</a> is no longer valid after using 
    265 <a href=dpans6.htm#6.1.2450>WORD</a> .)
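
<p>
As a minimal sketch of how a program can respect this transience, the
hypothetical word REMAINING below asks SOURCE for the buffer afresh every
time it runs and only reads from it.  The name is an illustration, not part
of the Standard, and /STRING is assumed to be available from the optional
String word set.

<pre>
\ Hypothetical helper: describe the unparsed part of the current input buffer.
\ Assumes /STRING from the optional String word set is present.
: REMAINING ( -- c-addr u )
   SOURCE           \ fetch the address and length now, never cache them
   >IN @ /STRING    \ skip what has already been parsed, nothing is stored
;
</pre>

<p>
The pair it returns is as transient as the result of SOURCE itself, so a
caller would be expected to consume it before the input source nests,
un-nests, or is refilled.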
    266 
    267 <p>
    268 The TC expects all Systems to process buffers provided by  
    269 <a href=dpans6.htm#6.1.1360>EVALUATE</a>
    270 in place.  This is logically necessary, in our view, since there are
    271 no upper limits on the lengths of these buffers.  Since it is 
    272 semantically permissible to describe more than half of addressable memory
    273 in an <a href=dpans6.htm#6.1.1360>EVALUATE</a>  string it is not in 
    274 general *possible* to copy such a
    275 string elsewhere and address it consistently with the definition
    276 of  <a href=dpans6.htm#6.1.2216>SOURCE</a> .
    277 
    278 <p>
    279 Systems are not allowed to alter the contents of 'input buffers' 
    280 provided through  <a href=dpans6.htm#6.1.1360>EVALUATE</a> .  At the 
    281 same time, applications are responsible for guaranteeing that the 
    282 buffers they provide to  <a href=dpans6.htm#6.1.1360>EVALUATE</a>  are
    283 static in both address and content until the  
    284 <a href=dpans6.htm#6.1.1360>EVALUATE</a>  has completed.
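
<p>
One way a program might meet this obligation while still editing source
text is to work on a private copy, as the query itself suggests.  The sketch
below is only an illustration: SCRATCH, /SCRATCH, and EVALUATE-COPY are
hypothetical names, and the fixed size of 128 characters is an arbitrary
assumption.

<pre>
128 CONSTANT /SCRATCH                  \ hypothetical size of the work area
CREATE SCRATCH /SCRATCH CHARS ALLOT    \ hypothetical private buffer

: EVALUATE-COPY ( c-addr u -- )
   /SCRATCH MIN                \ clamp the length to the work area
   >R SCRATCH R@ CHARS MOVE    \ copy the text, the source is only read
   \ ... edit the copy in SCRATCH here, before it becomes an input buffer ...
   SCRATCH R> EVALUATE         \ the copy must stay untouched until this returns
;
</pre>

<p>
Because SCRATCH is a single shared area, this sketch is not reentrant: text
evaluated through EVALUATE-COPY must not itself call EVALUATE-COPY, since
that would overwrite a buffer that is still active.  Text longer than the
work area is silently truncated by the clamp.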
    285 
    286 <p>
    287 <a href=dpans6.htm#6.1.1360>EVALUATE</a> is a special case since the 
    288 input buffer is literally
    289 the area of memory provided to <a href=dpans6.htm#6.1.1360>EVALUATE</a> and 
    290 is as such static.
    291 The application "owns" the memory it occupies, and the mechanism
    292 for messaging the interpreter via <a href=dpans6.htm#6.1.0560>>IN</a> 
    293 (as well as the syntax of
    294 Forth) implies no prefetching or preprocessing of input buffers
    295 is necessary or appropriate.
    296 
    297 <p>
    298 Given these conditions, it *is* deterministic for an application to
    299 store (with great care) into <a href=dpans6.htm#6.1.1360>EVALUATE</a> buffers 
    300 that it knows to be
    301 active, although such methods pertain exclusively to 
    302 <a href=dpans6.htm#6.1.1360>EVALUATE</a> and
    303 certainly not to any other input stream source.
    304 
    305 <p>
    306 However, as the Standard is written, any program that does this is
    307 not a standard program.
    308 
    309 <hr>
    310 
    311 It has been brought to the TC's attention that there exists at least
    312 one implementation which moves an <a href=dpans6.htm#6.1.1360>EVALUATE</a> 
    313 string to a separate work
    314 area for processing.  As a separate but related question we have
    315 been asked whether a system which does this, and which therefore
    316 fails John Hayes' test suite, is compliant.  (The test in question
    317 fails if the buffer address and length returned by 
    318 <a href=dpans6.htm#6.1.2216>SOURCE</a> occurring
    319 in an <a href=dpans6.htm#6.1.1360>EVALUATE</a> string differ from the 
    320 address and length of the
    321 argument provided to EVALUATE.)
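
<p>
A check of this kind can be sketched as follows.  This is an illustration
and not a quotation of the test suite: SAME-BUFFER? is a hypothetical name,
and the word is assumed to be executed from the interpreter so that the
evaluated SOURCE is executed rather than compiled.

<pre>
\ Hypothetical check: does SOURCE inside EVALUATE describe the argument itself?
: SAME-BUFFER? ( -- flag )
   S" SOURCE" 2DUP EVALUATE    \ ( c-addr u c-addr' u' )
   ROT =                       \ ( c-addr c-addr' flag )  lengths equal?
   >R = R> AND                 \ ( flag )  addresses equal as well?
;
</pre>

<p>
On a system that processes the string in place the word should leave a true
flag; a system that first moves the string to a separate work area would be
expected to leave false.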
    322 
    323 <p>
    324 The system in question is definitely not compliant since both the
    325 letter of the standard and the intent of the TC are that 
    326 <a href=dpans6.htm#6.1.1360>EVALUATE</a>
    327 strings be processed in place.  The fact that John's test suite is
    328 able to detect the deviation confirms its visibility to application
    329 code, assuming that any such code ever makes such a comparison.
    330 
    331 <p>
    332 The ambiguity documented for storing into 'input buffers' serves to
    333 mitigate the seriousness of this noncompliance since any program
    334 which stores into an active <a href=dpans6.htm#6.1.1360>EVALUATE</a> 
    335 string is environmentally dependent on the system's resolution of 
    336 the ambiguity.  Nevertheless, the
    337 Standard clearly states that the <a href=dpans6.htm#6.1.1360>EVALUATE</a> 
    338 string *is* the 'input
    339 buffer' (by definition a region of memory), and that 
    340 <a href=dpans6.htm#6.1.2216>SOURCE</a> returns
    341 the address and length of this region of memory.  This is an unqualified 
    342 promise to people writing standard programs, and any system
    343 which breaks that promise must document its nonstandard implementations 
    344 of <a href=dpans6.htm#6.1.1360>EVALUATE</a> and <a href=dpans6.htm#6.1.2216>SOURCE</a>.
    345 
    346 <h2><a name="ballot">Letter Ballot</a></h2>
    347 
    348 <pre>
    349 X3 Subgroup Letter Ballot
    350 Authorized by X3 Procedures - Distributed by X3 Subgroup X3J14
    351 Project: X3J14, ANS Forth
    352 Doc#:  X3J14/LB017
    353 Reference Doc#s:  X3J14/Q0006R, X3.215.1994 ANS Forth
    354 Date:  February 19, 1996
    355 Title:  Response to Request for Interpretation Q0006, Writing to Input Buffers
    356 Ballot Period:  30 Days
    357 Ballot Closes NOON DATE:  March 21, 1996
    358 Respond to:  greg@minerva.com
    359         or:  Elizabeth D. Rather, Chair
    360              FORTH, Inc.
    361              111 N. Sepulveda Blvd.  Suite 300
    362              Manhattan Beach, CA  90266
    363              (310) 372-8493    FAX (310) 318-7130
    364              erather@forth.com
    365 
    366 Statement:
    367     Document X3J14/Q0006R contains a proposed Response to Request for
    368     Interpretation Q0006.
    369 
    370 Question:
    371     Do you agree that this response represents the intended interpretation of
    372     X3.215.1994 ANS Forth?
    373 
    374 
    375 /------------------------  begin response area----------------------\
    376 |
    377 |  YES____ NO____ ABSTAIN____
    378 |
    379 |  Signature:  [not required for email ballots]
    380 |  Name:
    381 |  Organization:
    382 |
    383 |  Explanation (REQUIRED for NO or ABSTAIN votes):
    384 |    &lt;none&gt;
    385 \------------------------  end response area  ----------------------/
    386 
    387 INSTRUCTIONS:
    388 Please return the entire letter ballot with your response _either_ by email
    389 to greg@minerva.com _or_ by regular mail or fax or email to me at the above
    390 address, before the closing date & time.
    391 
    392    If replying electronically PLEASE edit only within the response area
    393    indicated above, inserting any explanatory text in place of &lt;none&gt;.
    394    Any changes made outside that area will likely be overlooked.
    395 
    396 All TC members must vote.  Failure to vote in two consecutive ballots may
    397 cause you to lose your voting rights in X3J14.
    398 
    399 Thank you for your participation.
    400 
    401 Elizabeth D. Rather, Chair, X3J14
    402 </pre>
    403 
    404 <h2><a name="results">Results of Letter Ballot</a></h2>
    405 
    406 <pre>
    407 Letter ballot 17 closed at noon March 21 with the following results:
    408 
    409         Y  N  A NV
    410 LB17:  12, 0, 1, 1
    411 
    412 Abstention from John Hayes.
    413 </pre>