summaryrefslogtreecommitdiff
path: root/doc/doc-misc/RFC.conform
blob: d8aceb6e797485a75b48f597030b85db64ee7c11 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
Conformance with RFCs
---------------------

Exim is written to follow the rules laid down in the RFCs. However, there are
some circumstances where it either extends what is specified, or chooses not to
follow them strictly, for various reasons. Sometimes variations are controlled
by an option, which may default on or off. This document lists the variations
from the latest email RFCs, and discusses their background and implications.

Last Updated: 25 January 1999


1. RFC 822
----------

The original specification of the format of Internet mail messages is RFC 822,
later clarified and modified by RFC 1123. At the time of writing (January 1999)
a new RFC (currently known as draft-ietf-drums-msg-fmt-07) which updates and
consolidates all the material related to the message format is at a late stage
of drafting, and is expected to become an Internet Standard in due course.

The following is (I hope) a complete list of major variations from the draft
RFC. References in square brackets are to the -07 draft.


1.1 Line termination [2.1, 2.3]
-------------------------------

[Lines are terminated by CRLF; isolated CR and LF are not permitted.]

The CRLF requirement has to be interpreted carefully, because the RFC also says
that it does not cover the internal format "used by sites". Exim keeps messages
on its spool in Unix format, using only LF as the line terminator, and also
does local deliveries using only LF. I believe this is compliant with the RFC,
as these are both "internal formats".

Messages sent out by SMTP have CRLF line terminators. However, isolated CR
characters are treated as any other data characters, because Exim is eight-bit
clean (see 1.2 below).

See 2.1 below for a discussion of line terminators in incoming messages.


1.2 Eight-bit characters [2.1]
------------------------------

[Messages consist of 7-bit characters.]

Exim is eight-bit clean. It does not do any processing of the characters in the
body of a message.


1.3 Maximum line length [2.1, 2.3]
----------------------------------

[The maximum length of a line is 998 characters.]

Exim does not enforce any limit on line length.


1.4 The "phrase" part of an address [3.4]
-----------------------------------------

[The phrase is a sequence of "words"; a word is an "atom" or a quoted string.]

The characters that can be used in an "atom" do not include the full stop
(dot, period). Thus a header line such as

  To: John Q. Public <jqp@anywhere.org>

is syntactically invalid under a strict interpretation of the RFC because the
dot in the phrase part is not quoted. However, many MTAs do not enforce this
restriction, so Exim was changed to be relaxed about it as well. In fact, the
draft RFC is moving towards allowing this. In section [4.1], which is defining
"obsolete" syntax that programs must accept (but not generate), it says this:

  The period character is added to obs-phrase.

  Note: The period character in obs-phrase is not a form that was allowed
  in earlier versions of this or any other standard. Period (nor any other
  character from specials) was not allowed in phrase because it introduced
  a parsing difficulty distinguishing between phrases and portions of an
  addr-spec (see section 4.4). It appears here because the period
  character is currently used in many messages in the display-name portion
  of addresses, especially for initials in names, and therefore must be
  interpreted properly. In the future, period may appear in the regular
  syntax of phrase.


1.5 Source routed addresses [4.4]
---------------------------------

[Source routed addresses are always enclosed in <>.]

Source routed addresses are declared obsolete in the draft RFC, but MTAs are
still required to handle them. Strictly, a source-routed address must be
enclosed in <> characters, so a header such as

  From: @a,@b:c@d

is syntactally invalid. Exim does not enforce this restriction.


1.6 Local parts [3.4.1]
-----------------------

[Dots in unquoted local parts may not be consecutive or at either end.]

Exim allows unquoted local parts to begin or end with a dot (period, full
stop), and it also permits two consecutive dots in a local part.



2. RFC 821
----------

The original specification of SMTP is RFC 821, later clarified and modified by
RFC 1123. Domain name system requirements and their implications for mail are
covered in RFCs 1035 and 974. A scheme for extending the SMTP protocol is
described in RFC 1869, and there are subsequent RFCs specifying particular
extensions.

At the time of writing (January 1999) a new RFC (currently known as
draft-ietf-drums-smtpupd-09) which updates and consolidates all the material
connected with SMTP message transmission is at a late stage of drafting, and is
expected to become an Internet Standard in due course.

The new draft is written using the terms MUST, SHOULD, and MAY, which, when
written in capital letters, have precise meanings. To quote from the draft:

  "MUST" or "MUST NOT" identify absolute requirements for conformance to
  this specification. Implementations that do not conform to them lie
  outside the scope of this specification and often will not
  interoperate properly with SMTP implementations that do conform.
  Implementations that are fully conforming also adhere to all "SHOULD"
  and "SHOULD NOT" requirements. Implementations that adhere to all
  "MUST" ("MUST NOT") but not to all of these are considered to be
  partially conforming. Such implementations may interoperate properly
  with fully conforming ones and with each other, but this will
  typically be the case only if great care is taken. Consequently, an
  implementation should violate "SHOULD" ("SHOULD NOT") requirements
  only under exceptional and well-understood circumstances.

The implementation of Exim is intended to conform to the spirit of this
paragraph. The following is (I hope) a complete list of major variations
from the draft RFC. In addition to the items listed here, there are other minor
extensions such as the tolerance of white space in places where it is not
strictly permitted by the RFC. References in square brackets are to the -09
draft sections, and brief summaries of the RFC requirement are also given in
square brackets.


2.1 Line termination [2.3.7, 4.1.1.4]
-------------------------------------

[SMTP lines are terminated by CRLF.]

Exim recognizes LF without CR as a line terminator in all forms of input. For
SMTP input, any preceding CR is discarded. An early version of Exim followed
the RFC strictly, and did not recognize LF without CR in SMTP input. However,
it seems that sites on the net send out messages with just LF terminators,
despite the warnings in the RFCs, and other MTAs handle this, so Exim was
changed. However, there is a compile time macro called STRICT_CRLF which can be
set to restore the strict behaviour, though this is undocumented.


2.2 Eight-bit characters [2.4.1]
--------------------------------

[SMTP transmits only 7-bit characters.]

Exim is eight-bit clean, and makes no attempt to modify the data in a message
in any way. In particular, for messages containing characters with the top bit
set, it neither tries to negotiate 8-bit transmission, nor converts such
characters into an encoded form. In other words, it adopts the "just send 8"
strategy. It can be configured to send out 8BITMIME in its response to EHLO
(which it does not do by default), and it recognizes the 8BITMIME keyword on
incoming messages, but neither of these affect its handling of message data.
"Just send 8" is the strategy of a number of MTAs; it is argued that it
achieves what the user wants more often than other strategies.


2.3 Use of EHLO/HELO [3.2]
--------------------------

[Client MTAs should always start with EHLO, not HELO.]

Exim sends EHLO only when it finds the string "ESMTP" in an SMTP greeting
message. If EHLO is refused with a 5xx return code, it then reverts to HELO as
required, but it does not contain logic for converting to HELO on other errors
such as loss of connection or timeout after EHLO. That is one reason why it
doesn't always send EHLO; there are reported to be ancient SMTP servers out
there which collapse on receiving EHLO. (There is also at least one server
whose banner reads "<host name> ignores ESMTP", but it is RFC 821 compliant in
that it responds with 5O0 to EHLO, so Exim successfully reverts to HELO.)


2.4 Closing the connection [4.1.1.10]
-------------------------------------

[Client must wait for response to QUIT before closing the connection.]

Exim closes the connection immediately after sending QUIT, without waiting for
the reply. There was a lot of discussion about this on one of the mailing
lists. The conclusion was that this behaviour is fine on Unix systems, which
have TCP/IP implementations that close down the underlying channel tidily even
when the associated process has terminated. Indeed, not waiting may be
beneficial, as it moves the TIME_WAIT state (waiting to ensure there's no more
data in transit) from the server to the client system. On some other operating
systems (I understand) it is a disaster to terminate the sending process
without waiting for the QUIT response, because all the data about the
connection lives in the client's process space, and is therefore thrown away
before the response arrives. The subsequent arrival of the response then causes
bad behaviour.


2.5 IPv6 address literals [4.1.2]
---------------------------------

[IPv6 address literals are introduced by "IPv6".]

Exim recognizes IPv6 literals as just the colon-separated hexadecimal form of
an IPv6 address, for example 1080:0:0:0:8:800:200C:417A, without the need for a
prefix. At present, it does not even recognize the prefix. When IPv6 becomes
more widespread, Exim will follow whatever the common usage is.


2.6 Underscores in domain names [4.1.2]
---------------------------------------

[Underscores are not legal in domain names.]

RFC 822 allows all characters except specials, space, and controls in domain
names, but the SMTP RFCs are stricter, allowing only letters, digits, and
hyphen. Exim is compliant when checking incoming addresses in SMTP commands,
but it is more relaxed by default when checking domain names that are supplied
by EHLO or HELO commands, because many client workstations get set up with
underscores in their names. There is an option that can be set to cause Exim to
refuse underscores. (There are also options to specify certain hosts from which
it will accept any old junk after EHLO or HELO. Such is the woeful state of
some SMTP clients.)


2.7 Removal of return-path headers [4.4]
----------------------------------------

[Relaying MTAs should not remove return-path.]

Exim removes Return-Path: headers from all messages, if return_path_remove is
set (the default). It does not attempt to determine if it is being a relay or
not. Indeed, for some messages it might be both a relay and a final destination
MTA for the same message.


2.8 Randomizing the order of addresses of multihomed hosts [5]
--------------------------------------------------------------

[Multihomed host addresses should not be randomized.]

Exim does randomize a list of several addresses for a single host, because
caching in resolvers will defeat the round-robinning that many namerservers
use. (Note: this is not the same as randomizing equal-valued MX records. That
is required by the RFC.)


2.9 Handling "MX points to self" [5]
------------------------------------

[MX points to self must be treated as an error.]

The RFC doesn't allow for the possibility of special-purpose routing in the
case when the lowest numbered MX record points to the local host. The default
Exim configuration is compliant, but it is possible to configure Exim to behave
differently, and there are several situations where this can be useful.


2.10 Source routing [6.1]
-------------------------

[Source routes should be stripped.]

The new RFC has moved forward in deprecating source-routed email addresses.
Exim does not strip them down by default, but can be made to do so by setting
collapse_source_routes. However, even when it is not stripping them down, it
does not add host routing to reverse-paths when processing a source-routed
forward-path.


2.11 Loop detection [6.2]
-------------------------

[Loop count for Received: headers should be at least 100.]

Exim's default setting of the received_headers_max option is 30. Most messages
these days seem to accumulate less than half a dozen Received: headers, and
even a couple of forwardings don't bring this anywhere near 30.


2.12 Addition of missing headers [6.3]
--------------------------------------

[Missing headers may be added, and domains qualified, only if client is
identified.]

Exim always adds Message-Id: and Date: headers if these are missing, whatever
the source of the message, and likewise when it expands non-fully-qualified
domains, it does so independently of the message's source.


2.13 Syntax of MAIL and RCPT commands [4.1.1.2, 4.1.1.3]
--------------------------------------------------------

Exim is more relaxed than the RFC requires:

(1) Trailing white space is ignored.

(2) It permits white space after the "FROM" and "TO" keywords.

(3) It does not insist on the address being enclosed in <> characters. In fact,
    it recognizes addresses in RFC 822 format here, except that domain
    components are restricted to containing only letters, digits, and hyphens.

(4) Local parts are permitted to contain null components, that is, may start or
    end with an unquoted full stop (period) or contain two consecutive
    unquoted full stops.


2.14 Non-fully-qualified domains [2.3.5]
----------------------------------------

[All domains must be fully qualified.]

A domain that is not fully qualified has some of its trailing components
missing, and is normally a local alias of some sort, for example, just a
single-component host name.

Exim can be configured to "widen" non-fully-qualified domains, either by using
the facilities of the DNS resolver, or by an explicit list of widening strings.
When this is done, it applies to addresses received by SMTP from other hosts,
as well as to locally-originated addresses. Address re-writing could also be
used for this purpose.


2.15 Unqualified addresses [4.1.2]
----------------------------------

[Addresses in SMTP commands must include domains.]

An unqualified address consists of a local part without a domain. Do not
confuse "qualified address" and "qualified domain". A qualified address may
include a non-fully-qualified domain.

There is one exception to the RFC rule: it is required that the unqualified
address "<postmaster>" always be accepted. Apart from this, Exim rejects
domainless addresses in SMTP commands by default, but it can be configured with
a list of hosts and/or networks that are permitted to send addresses without
domains in SMTP commands. Any such address that is accepted (including
<postmaster>) is qualified by adding the value of the qualify_domain option.


2.16 VRFY and EXPN [3.5.1, 3.5.2, 3.5.3, 7.3]
---------------------------------------------

[VRFY and EXPN should be supported.]

Exim does not support VRFY and EXPN by default, but a list of hosts and
networks for which they are permitted can be given.


2.17 Checking of EHLO/HELO commands [4.1.4]
-------------------------------------------

[Client must send EHLO. Server must not refuse message if EHLO/HELO check
fails.]

Exim, as a client, always sends EHLO or HELO (see 2.3 above). As a server, it
does not insist on there having been a valid EHLO or HELO command before the
start of a message transaction. Any EHLO or HELO command that is received is
rejected only if it contains a syntax error. That is, it is never rejected on
the basis of any validation checking that may be performed on the data it
contains.

However, Exim can be configured to insist that (a) there is valid EHLO/HELO
command before any message transaction and (b) the domain in that command
matches the domain obtained by looking up the IP address of the sending host.
It is possible to specify exception lists of hosts and/or networks for which
this check does not apply.


2.18 Format of delivery error messages [3.7]
--------------------------------------------

[Standard report formats should be used if possible.]

Exim's delivery failure reports do not conform to the format described in RFC
1894.


## End ##