on modern systems, this makes it likely to end up on tmpfs, which is a
lot faster and ssd-friendlier.
the symlink is not deleted at the end, to minimize fs churn. that means
it will be dangling after a reboot, which gets fixed in the next run.
the operator was exactly inverted. that means that it actually wouldn't
compile with both older versions (that needed the aliases) and
potentially new versions (that will hide the data members - still not
the case as of 3.2).
amends 8a40554f0.
we failed to reset the box list pointer after freeing it, which would
lead to a crash.
we also failed to reset the listing status, which would lead to
malfunction if we hadn't already crashed.
this inlines imap_cleanup_store(), as there isn't much value in keeping
it. the message list is already freed when disowning the store anyway.
we need to deep-copy the struct hostent data, as otherwise the
concurrent connects will overwrite each other's lookup results.
this is a rather hypothetical fix, as the bug currently affects only
channels connecting two IMAP accounts, and only if the first host's
first address asynchronously fails to connect.
this is relevant only when listing an IMAP Store's contents, as that's
the only place where we aren't imposing the spelling ourselves.
we need to be careful not to treat our own canonical (prefix-stripped
and always slash-delimited) box names like that; codify that in
comments.
this reveals that commit 6f2160f1 may be deemed to have been incorrect -
the TODO item was ambiguous, and could quite possibly have meant this
fix. unsurprisingly, 380ccdd4 re-introduced it with more explicit
wording.
the query is untypical enough to have caused problems with davmail (when
we still used *:*) and mailo.com (until it got fixed), so better check
that the result (not) returned by the server makes sense.
we did already set up the timeout when starting to send commands, but so
far we did not reset it when succeeding to send out data. rectify that.
REFFAIL: 87sgy92we3.fsf@jnanam.net
strptime(3)'s "%d" day of the month conversion specifier does not accept
leading blanks in case of single digit numbers. "%e" does that.
While implementation details and differences between the two
day-of-month conversion specifiers vary, none of the major libcs
(incl. OpenBSD, FreeBSD, Illumos, musl) consume a leading blank for "%d"
except glibc, which consumes any number of spaces like in the "%e" case.
Using "%e" ensures that date strings like " 4-Mar-2018 16:49:25 -0500"
are successfully parsed by all major implementations in compliance to
X/Open Portability Guide Issue 4, Version 2 ("XPG4.2"). musl is now the
only one that still treats "%d" and "%e" without stripping any space.
Issue analysed and reported by Evan Silberman <evan@jklol.net> who found
mbsync 1.3.0 on OpenBSD 6.4 to fail with `CopyArrivalDate' set when
syncing mails with the above mentioned timestamp.
See https://marc.info/?l=openbsd-tech&m=155044284526535 for details.
Ater sasl_client_step() is called and the Cyrus SASL library forwards
it to the client plugin, if the result value is OK (authentication
succeeded), the clientout is filled out to be an empty string, even if
the client plugin wanted to return NULL.
To avoid that mbsync complains at this point, check the returned length
instead of the pointer.
turns out that some IMAP servers (e.g., poczta.o2.pl) do not return
messages in ascending UID order in response to a UID FETCH request,
which makes the driver violate the API contract.
counter this by sorting the messages. this also addresses the
long-standing (but hypothetical) issue that parallel UID FETCH requests
could be handled out-of-order and thus also lead to mixed up results.
based on patch by Marcin Niestroj <macius1990w@gmail.com>.
while only 1KiB is required by the IMAP spec, AUTHENTICATE GSSAPI with
Kerberos requires about 1700 bytes.
accomodate that, plus some reserve.
fix suggested by Tollef Fog Heen <tfheen@err.no> via Debian BTS.
maildir supports a 'P' flag which denotes the fact that a message has
been 'passed' on (forwarded, bounced). notmuch syncs this to the
'passed' tag.
Per https://tools.ietf.org/html/rfc5788, IMAP has a user-defined flag
(keyword) '$Forwarded' that is supported by many servers and clients
these days. (Technically, one should check for '$Forwarded' in the
server response.)
Restructure mbsync's flag parser to accept keywords (flags starting with
'$') but still bail out on unknown system flags (flags starting with '\').
Support '$Forwarded' as a first keyword since it maps to maildir's 'P'
and needs to be sorted in between the system flags.
Signed-off-by: Michael J Gruber <github@grubix.eu>
Mailbox driver flags are defined in several places. It is essential that
they are kept in sync, so mark them with the same string for easy
grepping with an alerting boiler plate.
Signed-off-by: Michael J Gruber <github@grubix.eu>
re-using the file name buffer for the headers wasn't such a great idea,
as _POSIX_PATH_MAX is only 256, while RFC2822 permits lines up to 1000
chars. and sure enough, i have a message with a whopping 470-char
message-id header ...
empty strings were previously meaningless, and starting with 72c2d695a,
failure to handle them lead to bogus results when the IMAP hierarchy
separator is legitimately empty (when the server genuinely supports none
and none is manually configured). non-null can be asserted more cleanly
than null-or-non-empty, so change the api like that.
incidentally, this also removes the need to work around gcc's bogus
warning in -Os mode.
problem found by "Casper Ti. Vector" <caspervector@gmail.com>
apple gcc 4.2 complains about the use of the pragma inside a function.
clang also complains, but because the pragma is entirely unknown to it.
as neither compiler emits the bogus warning in the first place, there is
no point in suppressing it anyway.
our current project structure precludes the clash between some indirect
include of ssl.h and our definition of 'S' (or 'M', i don't remember)
that happened on some system, so there is no need to avoid including it
any more.
this avoids complaints from some more picky compilers, as re-defining
typedefs (even to the same thing) is illegal before C11.
there is no reason not to, and debian even disabled 1.0 globally,
because it's (theoretically) too insecure in some contexts (BEAST
attack).
in the compat wrapper, the UseTLSv1 option has been re-interpreted as
v1.x, to avoid adding new options.
while that's just bad api, inflate() can return Z_BUF_ERROR during
normal operation.
contrary to the zpipe example and what the documentation implies,
deflate() actually isn't that braindead. add respective comments.
REFMAIL: CALA3aExMjtRL0tAmgUANpDTnn-_HJ0sYkOEXWzoO6DVaiNFUHQ@mail.gmail.com
The `socket_connect_one` function previously did an `exit(1)` when
encountering any errors with opening the socket. This would break
connecting to a host where multiple possible addrinfos are returned,
where the leading addrinfos are in fact impossible to connect to. E.g.
with a kernel configured without support for IPv6, the `getaddrinfo`
call may still return a hint containing an IPv6 address alongside
another hint with an IPv4 address. Creating the socket with the IPv6
address, which will cause an error, lead us to exiting early without
even trying remaining hints.
While one can argue that the user should have compiled without HAVE_IPV6
or used an appropriate DNS configuration, we can do better by simply
skipping over the current addrinfo causing an error. To do so, we split
out a new function `socket_connect_next`, which selects the next
available address info and subsequently calls `socket_connect_one`
again. When no hints remain, `sock_connect_one` will error out at that
point.
the only legitimate "deviant" UID is zero, meaning "no message". this
can be futher qualified by additional flags in the sync record, rather
than using magic values for the UID. in fact, the zero UID (so far
meaning only "expunged") was already optionally qualifed with "expired".
as a side effect, driver->store_msg() now returns 0 instead of -2 for
unknown UIDs. this was a hack to avoid translating the value later
on, but it made the api horrible, and now it's superflous in the first
place.
this ensures stable results when the boxes are used with different
OPEN_FLAGS (which will happen in a subsequent commit), at the negligible
cost of removing the implicit test of the maildir driver's ability to
enumerate new messages.
do that by wrapping the actual stores into proxies.
the proxy driver's code is auto-generated from function templates, some
parameters, and the declarations of the driver functions themselves.
attempts to do it with CPP macros turned out to be a nightmare.
if the server sends no UIDNEXT, do an initial FETCH to query the UID of
the last message.
same if the server sends no APPENDUID.
this allows us to remove the arbitrary limitation of the UID range to
INT_MAX, at the cost of additional round-trips.
multiple Channels can call driver_t::list_store() with different LIST_*
flags. assuming the flags are actually taken into consideration, using a
single boolean 'listed' flag to track whether the Store still needs to
be listed obviously wouldn't cut it - if INBOX does not live right under
Path and the Channels used entirely disjoint Patterns (say, * and
INBOX*), the second Channel in a single run (probably a Group) would
fail to match anything.
to fix this, make store_t::listed more granular. this also requires
moving its handling back into the drivers (thus reverting c66afdc0),
because the actually performed queries and their possible implicit
results are driver-specific.
note that this slightly pessimizes some cases - e.g., an IMAP Store with
Path "" will now list the entire namespace even if there is only one
Channel with Pattern "INBOX*" (because a hypothetical Pattern "*" would
also include INBOX*, and the queries are kept disjoint to avoid the need
for de-duplication). this isn't expected to be a problem, as listing
mailboxes is generally cheap.
Maildir++ is sufficiently different from the other SubFolder styles to
justify a separate function; the resulting code duplication is minimal,
but the separated functions are a lot clearer.
this also adds code which avoids that the message about excluding the
mailbox is printed multiple times - this could happen with Maildir++, as
the hierarchy is flattened.
amends 0f24ca31b.
we need a separate log entry type which does proper mmaxxuid tracking.
while moving code around, this also removes a redundant debug statement.
amends b1842617.
newmaxuid represents the highest UID for which a sync entry was created,
while maxuid represents the end of the range which is guaranteed to have
been propagated. that means that the former needs to be instantly
incremented (and logged), while the latter must not be touched until the
entire new message sync completes. this matters particularly in the case
of resuming an interrupted run, where sync entry creation must resume
exactly where it left off, while loading the box must use the old limit
to ensure that all messages are available for actual propagation.
we've been using indices to separate master/slave state for a long time,
so there is no point in using pairs of matching brackets to signify the
side in the journal. instead, use somewhat descriptive letters (S[een],
F[ind], T[rashed]) and the index itself.
they are derived from srec->status, which is unsigned. for not
understood reasons, the compiler complains only after extending status
to a full unsigned int.
on the way, localize the declarations.
latest since 77acc268, the code prior to these statements ensures that
the full length is available, so just use memcpy(). the code for
comparing TUIDs uses memcmp() anyway.
introduce recognition of $USE_VALGRIND to run all mbsync invocations
through valgrind.
this also removes the seemingly purposeless --log-fd=3 indirection.
when syncing flags but not re-newing non-fetched messages, there is no
need to query the message size for all messages, as the old ones are
queried only for their flags.
instead of a single hard-coded branch, use a generic method to split
ranges as needed.
this is of course entirely over-engineered as of now, but subsequent
commits will make good use of it.
turns out the comment advising against it was bogus - unlike for
memcmp(), the standard does indeed prescribe that the memchr()
implementation may not read past the first occurrence of the searched
char.
that's what the sources already assumed anyway. size_t is total
overkill, as No Email Ever (TM) will exceed 2GiB.
this also fixes a harmless format string warning in 32 bit builds.
if AuthMechs includes more than just LOGIN and the server announces any
AUTH= mechanism, we try SASL. but that can still fail to find any
suitable authentication mechanism, and we must not error out in that
case if we are supposed to fall back to LOGIN.
specifically, if AuthMechs included more than just LOGIN (which would be
the case for '*') and the server announced any AUTH= mechanism, we'd
immediately error out upon seeing it, thus failing to actually try
LOGIN.
the number was chosen to make queries more comprehensible when the
server sends no UIDNEXT, but it appears that such insanely large UIDs
actually show up in the wild. so send 32-bit INT_MAX instead.
note that this is again making an assumption: that no server uses
unsigned ints for UIDs. but we can't sent UINT_MAX, as that would break
with servers which use signed ints. also, *we* use signed ints (which is
actually a clear violation of the spec).
it would be possible to special-case the range [1,inf] to 1:*, thus
entirely removing arbitrary limits. however, when the range doesn't
start at 1, we may actually get a single message instead of none due to
the imap uid range limits being unordered. this gets really nasty when
we need to issue multiple queries, as we may list the same message
twice.
a reliable way around this would be issuing a separate query to find the
actual value of UID '*', to make up for the server not sending UIDNEXT
in the first place. this would obviously imply an additional round-trip
per mailbox ...
trashing many messages at once inevitably overtaxes m$ exchange, and the
connection breaks. without any progress tracking, it would restart from
scratch each time, which would lead to a) it never finishing and b) many
copies of the messages in the trash.
full transactions as we do for "proper" syncing would be over the top,
as it's not *that* bad if some messages get duplicated in the trash. so
we record only the messages for which trashing completed, thus allowing
some overlap between the attempts.
turns out i misread the spec in a subtle way: while all other folders
are physically nested under INBOX, the IMAP view puts them at the same
(root) level. to get them shown as subfolders of INBOX, they need to
have _two_ leading dots.
this also implies that the Maildir++ mode has no use for a Path, so
reject attempts to specify one.
the mbsync manual says explicitly that the system's default certificate
store should *not* be specified.
however, the isync manual talked about CA certificates, which is (and
always was) exactly wrong.
also adjust both .sample rc files.
flock() may be implemented via fcntl(), which may cause the process to
deadlock itself when trying to apply both types of locks. this is the
case even on linux when the file lives on NFS.
it's unlikely that anything except mbsync would try to access the
.uidvalidity files anyway, so there is no point in trying to be
compatible with anything else ...
REFMAIL: uddy4g589ym.fsf@eismej-u14.spgear.lab.emc.com
it is legal for an email system to simply change the case of rfc2822
headers, and at least one imap server apparently does just that.
this would lead to us not finding our own header, which is obviously not
helpful.
REFMAIL: CA+fD2U3hJEszmvwBsXEpTsaWgJ2Dh373mCESM3M0kg3ZwAYjaw@mail.gmail.com
that pattern may very well expand to INBOXNOT, which would naturally
live under Path, so we need to look into the Path. of course, this
actually makes sense only if there *is* a Path, and complaining about
it being absent is backwards.
the idea that this is even possible was based on an incomplete reading
of the imap spec.
however, the infrastructure for supporting multi-char delimiters as such
is retained, as the Flatten option can be used with them.
recycling server connections skips everything up to setting up the
prefix (Path/NAMESPACE). "everything" should obviously include enabling
compression, as that must be done at most once per connection.
any structures may be invalid after callback invocation.
this has the side effect that the socket write callback now returns
void, like all other callbacks do.
the synchronous writing to the socket would have typically invoked the
write callback, which would flush further commands, thus recursing.
we take the easy way out and make it fully asynchronous, i.e., no data
is sent before (re-)entering the event loop.
this also has the effect that socket_write() cannot fail any more, and
any errors will be reported asynchronously. this is consistent with
socket_read(), and produces cleaner code.
this introduces a marginal performance regression: the maildir driver is
synchronous, so all messages (which fit into memory) will be read before
any data is sent. this is not considered relevant.
the legacy style is a poorly executed attempt at Maildir++, so introduce
the latter for the sake of completeness. but most users will probably
just want to use subfolders without any additional dots.
move the unconditional addition of INBOX out ouf the function.
this makes it possible to move the folder check and addition to the
listing before the recursion, which seems clearer.
in the case of imap stores, the failure is bound to the server config,
not just the store config.
that means that the storage of the failure state needs to be private to
the driver, accessible only through a function.
simply make the code symmetrical to the inverse case.
note that the result will be sort of awkward, as the folders under Path
(and thus the subfolders of Inbox) don't start with a dot, while the
subfolders of these folders do. this needs to be addressed separately.
when we run into Inbox while listing Path, check whether Inbox is being
listed anyway, and just skip it if so, instead of listing it right away
and resetting LIST_INBOX (and thus having a calling order dependency).
USER (the authorization identity) specifies whom to act for.
AUTHNAME (the authentication identity) specifies who is acting (and
thus whose PASS is being used).
USER is derived from AUTHNAME if omitted, but apparently the
GSS-API module automatically adds the REALM, which is not helpful.
it appears to be common to set both USER and AUTHNAME to the same value,
so let's just do it as well.
REFMAIL: 20150407194807.GA1714@leeloo.kyriasis.com
the PassCmd will be typically non-interactive (or it will use a gui
password agent), so starting a new line just makes the progress counter
uglier. so make it configurable and default to no line break.
- the old meaning of -V[V] was moved to -D{n|N}, as these are really
debugging options.
- don't print the info messages by default; this can be re-enabled with
the -V switch, and is implied by most debug options (it was really
kind of stupid that verbose/debug operation disabled these).
- the sync algo/state debugging can be separately enabled with -Ds now.
... instead of determining them on the fly, because
- it enables early display of totals (to be used soon)
- it enables re-use of the data (to be used at some point)
- the code is less cryptic
note that we leak the data created in main(), consistently with other
configuration-related data.
instead of creating three lists of mailboxes (common, master, slave)
and deriving the mailbox presence from the list being processed, create
a single joined list which contains the presence information.
additionally, the list is sorted alphabetically (with INBOX first),
which is neater.
it helps if the code actually does what the comment above it claims.
clarify it a bit, so i don't get stupid ideas again.
This reverts commit cf6a7b4d18.
propagating many messages from a fast store (typically maildir or a
local IMAP server) to a slow asynchronous store could cause gigabytes of
data being buffered. avoid this by throttling fetches if the target
context reports memory usage above a configurable limit.
REFMAIL: 9737edb14457c71af4ed156c1be0ae59@mpcjanssen.nl
some servers actually bother to close down the SSL connection before
closing the socket.
this fixes the spurious "unhandled SSL error 6" messages.
REFMAIL: 20150120114805.GA17586@leeloo.kyriasis.com
the server can actually close the zlib stream before closing the socket,
so we need to accept it.
we don't do anything beyond that - the actual EOF will be signaled by
the socket, and if the server (erroneously) sends more data, zlib will
tell us about it.
REFMAIL: 1423048708-975-1-git-send-email-alex.bennee@linaro.org
zlib reports Z_BUF_ERROR when a flush is attempted without any activity
since the previous flush (if any). while this is harmless as such,
discerning the condition from genuine errors would be much harder than
avoiding the pointless flush in the first place.
REFMAIL: eb5681612f17be777bc8d138d31dd6d6@mpcjanssen.nl
don't retry dead Stores for every Channel.
this also introduces a state for transient errors (specifically, connect
failures), but this is currently unused.
a directory is no mailbox unless it contains a cur/ subdir.
but if that one is present, create new/ and tmp/ if they are missing.
this makes it possible to resume interrupted maildir creations.
don't try to lock it until we actually read or write it.
the idea is to not fail with SyncState * if we tried to load the state
before selecting a non-existing mailbox. this is ok, because if the
mailbox is missing, we obviously have no sync state pertaining to it,
either.
as a side effect, this allows simplifying an error path.