the messages are trashed in mailbox (and thus UID) order, and in
practice we expect the operations to complete in order. however, if
older messages need to be trashed after a journal replay, and we get
interrupted again, the next replay would produce an unsorted array,
and thus break the binary search.
amends 2bba9b9.
the input isn't necessarily null-terminated (it currently is for imap,
but not for maildir), so if the message ended somewhere within the
header field name, we'd read beyond its end, which theoretically could
cause a crash. no other adverse effects could result, as we'd stop
processing such a broken message right afterwards.
amends 70bad661.
that shouldn't really be a problem, as we have 2GB of headroom, and most
growth would happen when sending an all-newlines message from maildir to
imap (due to CR additions), which is mostly non-critical. but better
safe than sorry.
when a broken/compromised/malicious server gives us a message that
starts with an empty line, we'd enter the path for inserting a pristine
placeholder subject, for which we unfortunately didn't actually allocate
space (unless MaxSize is in use and the message exceeds it).
note that this cannot be triggered by merely receiving a crafted mail
with no headers (yes, it's actually possible to send such a thing), as
the delivery of mails adds plenty of headers.
amends 70bad661.
in particular, this covers the case of a mailbox being replaced with an
empty new one, which would subsequently lead to the opposite end being
emptied as well, which would typically be undesired.
also add plenty of comments.
don't print the actual values, which are meaningless technicalities
to the average user, and can be obtained separately for debugging if
really necessary.
also, fix the omission of the affected mailboxes from one of the
messages.
don't say DRV_CANCELED when it's really DRV_STORE_BAD, as apart from
being just wrong, it lead to the confusing effect of canceling a store
as the result of a supposed cancellation of the same store.
delay the creation of the new state and journal until there is actually
something interesting to write. this saves some cpu cycles and prolongs
ssd life a whee bit.
now that expiration order is determined by a single loop ordered by
far-side UIDs, it is no longer necessary to accurately track the highest
seen UID.
as a side effect, this fixes a problem reported (way too long ago) by
Yuri D'Elia: we failed to up newmaxuid for messages we produced
ourselves, so we would keep enumerating the same messages until we also
propagated externally generated messages from that mailbox - which might
have been never for the server side of archive/trash mailboxes.
we can do that, as unpaired near-side messages are ignored anyway.
this mildly changes expiration order, as near-side messages that
existed for a long time but were propagated much later will be expired
later. however, that has no practical relevance.
this is mostly theoretical, as at this point no updates to the message
list can have actually happened. but it's future-proof and consistent
with the near-side loop.
we already didn't propagate messages which would be instantly expunged
from the target, but failed to cancel propagations that were already
scheduled before we got interrupted. this matters a bit when the
resumption happens significantly later than the initial attempt, giving
the user time to mark messages on the source as deleted.
the 'pending' and 'skipped' sync record states are mutually exclusive
with having a complementary message, so there is no point in testing it
explicitly.
amends bd5fb6ff.
we need to pass a different "boundary" UID to driver_t::load_box() for
every OPEN_* flag that queries a partial range:
- OPEN_FIND refers to messages newer than all we know about
- OPEN_OLD_IDS refers to messages which are paired
- OPEN_{OLD,NEW}_SIZE refers to messages (not) above the committed
boundary of already propagated messages
we treated the 3rd like the 2nd, which was just wrong - the actual
boundary may be lower or higher, so we'd produce wrong results when
MaxSize was set and only one of New and ReNew was requested.
the underlying metaphor refers to an inhumane practice, so using it
casually is rightfully offensive to many people. it isn't even a
particularly apt metaphor, as it suggests a strict hierarchy that is
counter to mbsync's highly symmetrical mode of operation.
the far/near terminology has been chosen as the replacement, as it is a
natural fit for the push/pull terminology. on the downside, due to these
not being nouns, a few uses are a bit awkward, and several others had to
be amended to include 'side'. also, it's conceptually quite close to
remote/local, which matches the typical use case, but is maybe a bit too
suggestive of actually non-existing limitations.
the new f/n suffixes of the -C/-R/-X options clash with pre-existing
options, so direct concatenation of short options is even less practical
than before (some suffixes of -D already clashed), but doing that leads
to unreadable command lines anyway.
as with previous deprecations, all pre-existing command line and config
options keep working, but yield a warning. the state files are silently
upgraded.
... by making a lot of objects unsigned, and some signed.
casts which lose precision and change the sign in one go (ssize_t and
time_t to uint on LP64) are made explicit as well.
this does specifically *not* cover about a bazillion warnings about
size_t being shrunk to uint - these make no sense given the expected
data set size.
mostly ATTR_PRINTFLIKE(*, 0) for functions with a va_list argument.
also, one ATTR_NORETURN and one ATTR_UNUSED, both on functions.
also, an explicit suppression for a format string stored in a variable.
this is semantically cleaner, and fixes storing the flags in the rare
case that flags are not being synced and the target is not being
expunged, as in this case flags are queried only during the actual
propagation.
maildir supports a 'P' flag which denotes the fact that a message has
been 'passed' on (forwarded, bounced). notmuch syncs this to the
'passed' tag.
Per https://tools.ietf.org/html/rfc5788, IMAP has a user-defined flag
(keyword) '$Forwarded' that is supported by many servers and clients
these days. (Technically, one should check for '$Forwarded' in the
server response.)
Restructure mbsync's flag parser to accept keywords (flags starting with
'$') but still bail out on unknown system flags (flags starting with '\').
Support '$Forwarded' as a first keyword since it maps to maildir's 'P'
and needs to be sorted in between the system flags.
Signed-off-by: Michael J Gruber <github@grubix.eu>
Mailbox driver flags are defined in several places. It is essential that
they are kept in sync, so mark them with the same string for easy
grepping with an alerting boiler plate.
Signed-off-by: Michael J Gruber <github@grubix.eu>
empty strings were previously meaningless, and starting with 72c2d695a,
failure to handle them lead to bogus results when the IMAP hierarchy
separator is legitimately empty (when the server genuinely supports none
and none is manually configured). non-null can be asserted more cleanly
than null-or-non-empty, so change the api like that.
incidentally, this also removes the need to work around gcc's bogus
warning in -Os mode.
problem found by "Casper Ti. Vector" <caspervector@gmail.com>
the only legitimate "deviant" UID is zero, meaning "no message". this
can be futher qualified by additional flags in the sync record, rather
than using magic values for the UID. in fact, the zero UID (so far
meaning only "expunged") was already optionally qualifed with "expired".
as a side effect, driver->store_msg() now returns 0 instead of -2 for
unknown UIDs. this was a hack to avoid translating the value later
on, but it made the api horrible, and now it's superflous in the first
place.
do that by wrapping the actual stores into proxies.
the proxy driver's code is auto-generated from function templates, some
parameters, and the declarations of the driver functions themselves.
attempts to do it with CPP macros turned out to be a nightmare.
we need a separate log entry type which does proper mmaxxuid tracking.
while moving code around, this also removes a redundant debug statement.
amends b1842617.
newmaxuid represents the highest UID for which a sync entry was created,
while maxuid represents the end of the range which is guaranteed to have
been propagated. that means that the former needs to be instantly
incremented (and logged), while the latter must not be touched until the
entire new message sync completes. this matters particularly in the case
of resuming an interrupted run, where sync entry creation must resume
exactly where it left off, while loading the box must use the old limit
to ensure that all messages are available for actual propagation.
we've been using indices to separate master/slave state for a long time,
so there is no point in using pairs of matching brackets to signify the
side in the journal. instead, use somewhat descriptive letters (S[een],
F[ind], T[rashed]) and the index itself.
they are derived from srec->status, which is unsigned. for not
understood reasons, the compiler complains only after extending status
to a full unsigned int.
on the way, localize the declarations.
when syncing flags but not re-newing non-fetched messages, there is no
need to query the message size for all messages, as the old ones are
queried only for their flags.
that's what the sources already assumed anyway. size_t is total
overkill, as No Email Ever (TM) will exceed 2GiB.
this also fixes a harmless format string warning in 32 bit builds.
trashing many messages at once inevitably overtaxes m$ exchange, and the
connection breaks. without any progress tracking, it would restart from
scratch each time, which would lead to a) it never finishing and b) many
copies of the messages in the trash.
full transactions as we do for "proper" syncing would be over the top,
as it's not *that* bad if some messages get duplicated in the trash. so
we record only the messages for which trashing completed, thus allowing
some overlap between the attempts.
it is legal for an email system to simply change the case of rfc2822
headers, and at least one imap server apparently does just that.
this would lead to us not finding our own header, which is obviously not
helpful.
REFMAIL: CA+fD2U3hJEszmvwBsXEpTsaWgJ2Dh373mCESM3M0kg3ZwAYjaw@mail.gmail.com
the legacy style is a poorly executed attempt at Maildir++, so introduce
the latter for the sake of completeness. but most users will probably
just want to use subfolders without any additional dots.
- the old meaning of -V[V] was moved to -D{n|N}, as these are really
debugging options.
- don't print the info messages by default; this can be re-enabled with
the -V switch, and is implied by most debug options (it was really
kind of stupid that verbose/debug operation disabled these).
- the sync algo/state debugging can be separately enabled with -Ds now.
propagating many messages from a fast store (typically maildir or a
local IMAP server) to a slow asynchronous store could cause gigabytes of
data being buffered. avoid this by throttling fetches if the target
context reports memory usage above a configurable limit.
REFMAIL: 9737edb14457c71af4ed156c1be0ae59@mpcjanssen.nl
don't try to lock it until we actually read or write it.
the idea is to not fail with SyncState * if we tried to load the state
before selecting a non-existing mailbox. this is ok, because if the
mailbox is missing, we obviously have no sync state pertaining to it,
either.
as a side effect, this allows simplifying an error path.