that's mostly hypothetical, but let's not make assumptions.
this also adds EXPUNGE response handling to make total_msgs reliable. in
principle, this affects the post-SELECT UIDNEXT fallback as well, but
there the racing window is so short that this barely improves anything.
amends 94022a67.
the uidnext query following message stores can be interleaved with
message fetches. that means that we cannot rely on the 1st command in
flight being that query. but instead of iterating over all commands in
flight, move the uidnext query flag to imap_store (and make sure to
check for the presence of a message body before testing it) - this
avoids the loop and an extra byte in every command.
this also makes it clear that the query is mutually exclusive with
loading messages (the untagged responses are not distinguishable).
don't say DRV_CANCELED when it's really DRV_STORE_BAD, as apart from
being just wrong, it lead to the confusing effect of canceling a store
as the result of a supposed cancellation of the same store.
properly distribute the certificates between the SSL context's trust
store and our host cert list.
as a drive-by, clean up some nasty type casts at the cost of including
a second OpenSSL header into socket.h.
delay reporting success of STORE FLAGS until a subsequent CHECK
succeeds.
this fixes (inverse flag change propagation) and (deletes not being
propagated) after an interruption due to prematurely logged flag
updates.
delay the creation of the new state and journal until there is actually
something interesting to write. this saves some cpu cycles and prolongs
ssd life a whee bit.
now that expiration order is determined by a single loop ordered by
far-side UIDs, it is no longer necessary to accurately track the highest
seen UID.
as a side effect, this fixes a problem reported (way too long ago) by
Yuri D'Elia: we failed to up newmaxuid for messages we produced
ourselves, so we would keep enumerating the same messages until we also
propagated externally generated messages from that mailbox - which might
have been never for the server side of archive/trash mailboxes.
we can do that, as unpaired near-side messages are ignored anyway.
this mildly changes expiration order, as near-side messages that
existed for a long time but were propagated much later will be expired
later. however, that has no practical relevance.
this is mostly theoretical, as at this point no updates to the message
list can have actually happened. but it's future-proof and consistent
with the near-side loop.
we already didn't propagate messages which would be instantly expunged
from the target, but failed to cancel propagations that were already
scheduled before we got interrupted. this matters a bit when the
resumption happens significantly later than the initial attempt, giving
the user time to mark messages on the source as deleted.
the 'pending' and 'skipped' sync record states are mutually exclusive
with having a complementary message, so there is no point in testing it
explicitly.
amends bd5fb6ff.
we need to pass a different "boundary" UID to driver_t::load_box() for
every OPEN_* flag that queries a partial range:
- OPEN_FIND refers to messages newer than all we know about
- OPEN_OLD_IDS refers to messages which are paired
- OPEN_{OLD,NEW}_SIZE refers to messages (not) above the committed
boundary of already propagated messages
we treated the 3rd like the 2nd, which was just wrong - the actual
boundary may be lower or higher, so we'd produce wrong results when
MaxSize was set and only one of New and ReNew was requested.
the underlying metaphor refers to an inhumane practice, so using it
casually is rightfully offensive to many people. it isn't even a
particularly apt metaphor, as it suggests a strict hierarchy that is
counter to mbsync's highly symmetrical mode of operation.
the far/near terminology has been chosen as the replacement, as it is a
natural fit for the push/pull terminology. on the downside, due to these
not being nouns, a few uses are a bit awkward, and several others had to
be amended to include 'side'. also, it's conceptually quite close to
remote/local, which matches the typical use case, but is maybe a bit too
suggestive of actually non-existing limitations.
the new f/n suffixes of the -C/-R/-X options clash with pre-existing
options, so direct concatenation of short options is even less practical
than before (some suffixes of -D already clashed), but doing that leads
to unreadable command lines anyway.
as with previous deprecations, all pre-existing command line and config
options keep working, but yield a warning. the state files are silently
upgraded.
this is better than using PassCmd, as it allows the keychain manager to
identify the calling process and therefore use a selective whitelist.
unlike in the now removed example, we use an "internet password" for the
imap protocol, rather than a "generic password" - this seems more
appropriate.
based on a patch by Oliver Runge <oliver.runge@gmail.com>
It was already possible to retrieve passwords from arbitrary commands.
But this goes only half the way to allowing automated derivation of
login credentials, as some environments may also have different user
names based on the system. Therefore, add the UserCmd option to
complement PassCmd.
Based on a patch series by Patrick Steinhardt <ps@pks.im>
makes the code less cluttered, and it's harder to introduce leaks.
this has the hypothetical disadvantage that due to freeing being
delayed, the peak memory usage would rise significantly if we chained to
another parse_list() call which produces a big list while already
holding a big list, but that isn't the case anywhere.
... by making a lot of objects unsigned, and some signed.
casts which lose precision and change the sign in one go (ssize_t and
time_t to uint on LP64) are made explicit as well.
this does specifically *not* cover about a bazillion warnings about
size_t being shrunk to uint - these make no sense given the expected
data set size.
mostly ATTR_PRINTFLIKE(*, 0) for functions with a va_list argument.
also, one ATTR_NORETURN and one ATTR_UNUSED, both on functions.
also, an explicit suppression for a format string stored in a variable.
this is mostly to work around the fact that both gcc and clang won't
accept the format string declaration (i.e., will complain with
-Wformat-nonliteral) if the *called* function does not actually take a
va_list.
on the upside, it makes one caller cleaner. yay ...
this is actually potentially counterproductive, as people who have set
SSLVersions and fail to adjust it will _lose_ tls 1.3 support. however,
without the option being there, people (incorrectly) believe that tls
1.3 is not supported.
otherwise the server would interpret it as INBOX contrary to our
expectations, which might lead to moderately surprising effects.
if you really want to sync your ~/maildir/inbox to the IMAP INBOX,
specify it as the Maildir Store's Inbox.
Some distributions (e.g. Fedora) added support for system wide crypto
policies. This is supported in most common crypto libraries including
OpenSSL. Applications can override this policy using their own cipher
string. This commit adds support for specifying the cipher string in
the mbsync configuration.
For example, to exclude Diffie-Hellman, the user can specify
CipherString "DEFAULT:!DH"
in the IMAP Account's configuration.
this is semantically cleaner, and fixes storing the flags in the rare
case that flags are not being synced and the target is not being
expunged, as in this case flags are queried only during the actual
propagation.
the workaround for -Wformat triggered -Wformat-nonliteral in turn.
so instead go back to using pragmas and add a proper gcc version check.
this also works with clang - mostly for qt-creator's code model, which
is clang-based.
amends/reverts 55e65147.
the assumption was that this wouldn't be needed, as maildir_store_msg()
reliably delivers a UID. however, if we crash right before the callback
can record that UID, we'd still use OPEN_FIND in the next run, which
requires the saved next UID.
an effect of 7ce658d is that we can index messages by UID rather than
content (or more specifically, subject). apart from being cleaner, it
allows duplicated subjects.
on modern systems, this makes it likely to end up on tmpfs, which is a
lot faster and ssd-friendlier.
the symlink is not deleted at the end, to minimize fs churn. that means
it will be dangling after a reboot, which gets fixed in the next run.
the operator was exactly inverted. that means that it actually wouldn't
compile with both older versions (that needed the aliases) and
potentially new versions (that will hide the data members - still not
the case as of 3.2).
amends 8a40554f0.
we failed to reset the box list pointer after freeing it, which would
lead to a crash.
we also failed to reset the listing status, which would lead to
malfunction if we hadn't already crashed.
this inlines imap_cleanup_store(), as there isn't much value in keeping
it. the message list is already freed when disowning the store anyway.
we need to deep-copy the struct hostent data, as otherwise the
concurrent connects will overwrite each other's lookup results.
this is a rather hypothetical fix, as the bug currently affects only
channels connecting two IMAP accounts, and only if the first host's
first address asynchronously fails to connect.
this is relevant only when listing an IMAP Store's contents, as that's
the only place where we aren't imposing the spelling ourselves.
we need to be careful not to treat our own canonical (prefix-stripped
and always slash-delimited) box names like that; codify that in
comments.
this reveals that commit 6f2160f1 may be deemed to have been incorrect -
the TODO item was ambiguous, and could quite possibly have meant this
fix. unsurprisingly, 380ccdd4 re-introduced it with more explicit
wording.
the query is untypical enough to have caused problems with davmail (when
we still used *:*) and mailo.com (until it got fixed), so better check
that the result (not) returned by the server makes sense.
we did already set up the timeout when starting to send commands, but so
far we did not reset it when succeeding to send out data. rectify that.
REFFAIL: 87sgy92we3.fsf@jnanam.net
strptime(3)'s "%d" day of the month conversion specifier does not accept
leading blanks in case of single digit numbers. "%e" does that.
While implementation details and differences between the two
day-of-month conversion specifiers vary, none of the major libcs
(incl. OpenBSD, FreeBSD, Illumos, musl) consume a leading blank for "%d"
except glibc, which consumes any number of spaces like in the "%e" case.
Using "%e" ensures that date strings like " 4-Mar-2018 16:49:25 -0500"
are successfully parsed by all major implementations in compliance to
X/Open Portability Guide Issue 4, Version 2 ("XPG4.2"). musl is now the
only one that still treats "%d" and "%e" without stripping any space.
Issue analysed and reported by Evan Silberman <evan@jklol.net> who found
mbsync 1.3.0 on OpenBSD 6.4 to fail with `CopyArrivalDate' set when
syncing mails with the above mentioned timestamp.
See https://marc.info/?l=openbsd-tech&m=155044284526535 for details.
Ater sasl_client_step() is called and the Cyrus SASL library forwards
it to the client plugin, if the result value is OK (authentication
succeeded), the clientout is filled out to be an empty string, even if
the client plugin wanted to return NULL.
To avoid that mbsync complains at this point, check the returned length
instead of the pointer.
turns out that some IMAP servers (e.g., poczta.o2.pl) do not return
messages in ascending UID order in response to a UID FETCH request,
which makes the driver violate the API contract.
counter this by sorting the messages. this also addresses the
long-standing (but hypothetical) issue that parallel UID FETCH requests
could be handled out-of-order and thus also lead to mixed up results.
based on patch by Marcin Niestroj <macius1990w@gmail.com>.
while only 1KiB is required by the IMAP spec, AUTHENTICATE GSSAPI with
Kerberos requires about 1700 bytes.
accomodate that, plus some reserve.
fix suggested by Tollef Fog Heen <tfheen@err.no> via Debian BTS.
maildir supports a 'P' flag which denotes the fact that a message has
been 'passed' on (forwarded, bounced). notmuch syncs this to the
'passed' tag.
Per https://tools.ietf.org/html/rfc5788, IMAP has a user-defined flag
(keyword) '$Forwarded' that is supported by many servers and clients
these days. (Technically, one should check for '$Forwarded' in the
server response.)
Restructure mbsync's flag parser to accept keywords (flags starting with
'$') but still bail out on unknown system flags (flags starting with '\').
Support '$Forwarded' as a first keyword since it maps to maildir's 'P'
and needs to be sorted in between the system flags.
Signed-off-by: Michael J Gruber <github@grubix.eu>
Mailbox driver flags are defined in several places. It is essential that
they are kept in sync, so mark them with the same string for easy
grepping with an alerting boiler plate.
Signed-off-by: Michael J Gruber <github@grubix.eu>
re-using the file name buffer for the headers wasn't such a great idea,
as _POSIX_PATH_MAX is only 256, while RFC2822 permits lines up to 1000
chars. and sure enough, i have a message with a whopping 470-char
message-id header ...
empty strings were previously meaningless, and starting with 72c2d695a,
failure to handle them lead to bogus results when the IMAP hierarchy
separator is legitimately empty (when the server genuinely supports none
and none is manually configured). non-null can be asserted more cleanly
than null-or-non-empty, so change the api like that.
incidentally, this also removes the need to work around gcc's bogus
warning in -Os mode.
problem found by "Casper Ti. Vector" <caspervector@gmail.com>
apple gcc 4.2 complains about the use of the pragma inside a function.
clang also complains, but because the pragma is entirely unknown to it.
as neither compiler emits the bogus warning in the first place, there is
no point in suppressing it anyway.
our current project structure precludes the clash between some indirect
include of ssl.h and our definition of 'S' (or 'M', i don't remember)
that happened on some system, so there is no need to avoid including it
any more.
this avoids complaints from some more picky compilers, as re-defining
typedefs (even to the same thing) is illegal before C11.
there is no reason not to, and debian even disabled 1.0 globally,
because it's (theoretically) too insecure in some contexts (BEAST
attack).
in the compat wrapper, the UseTLSv1 option has been re-interpreted as
v1.x, to avoid adding new options.
while that's just bad api, inflate() can return Z_BUF_ERROR during
normal operation.
contrary to the zpipe example and what the documentation implies,
deflate() actually isn't that braindead. add respective comments.
REFMAIL: CALA3aExMjtRL0tAmgUANpDTnn-_HJ0sYkOEXWzoO6DVaiNFUHQ@mail.gmail.com
The `socket_connect_one` function previously did an `exit(1)` when
encountering any errors with opening the socket. This would break
connecting to a host where multiple possible addrinfos are returned,
where the leading addrinfos are in fact impossible to connect to. E.g.
with a kernel configured without support for IPv6, the `getaddrinfo`
call may still return a hint containing an IPv6 address alongside
another hint with an IPv4 address. Creating the socket with the IPv6
address, which will cause an error, lead us to exiting early without
even trying remaining hints.
While one can argue that the user should have compiled without HAVE_IPV6
or used an appropriate DNS configuration, we can do better by simply
skipping over the current addrinfo causing an error. To do so, we split
out a new function `socket_connect_next`, which selects the next
available address info and subsequently calls `socket_connect_one`
again. When no hints remain, `sock_connect_one` will error out at that
point.
the only legitimate "deviant" UID is zero, meaning "no message". this
can be futher qualified by additional flags in the sync record, rather
than using magic values for the UID. in fact, the zero UID (so far
meaning only "expunged") was already optionally qualifed with "expired".
as a side effect, driver->store_msg() now returns 0 instead of -2 for
unknown UIDs. this was a hack to avoid translating the value later
on, but it made the api horrible, and now it's superflous in the first
place.
this ensures stable results when the boxes are used with different
OPEN_FLAGS (which will happen in a subsequent commit), at the negligible
cost of removing the implicit test of the maildir driver's ability to
enumerate new messages.
do that by wrapping the actual stores into proxies.
the proxy driver's code is auto-generated from function templates, some
parameters, and the declarations of the driver functions themselves.
attempts to do it with CPP macros turned out to be a nightmare.
if the server sends no UIDNEXT, do an initial FETCH to query the UID of
the last message.
same if the server sends no APPENDUID.
this allows us to remove the arbitrary limitation of the UID range to
INT_MAX, at the cost of additional round-trips.
multiple Channels can call driver_t::list_store() with different LIST_*
flags. assuming the flags are actually taken into consideration, using a
single boolean 'listed' flag to track whether the Store still needs to
be listed obviously wouldn't cut it - if INBOX does not live right under
Path and the Channels used entirely disjoint Patterns (say, * and
INBOX*), the second Channel in a single run (probably a Group) would
fail to match anything.
to fix this, make store_t::listed more granular. this also requires
moving its handling back into the drivers (thus reverting c66afdc0),
because the actually performed queries and their possible implicit
results are driver-specific.
note that this slightly pessimizes some cases - e.g., an IMAP Store with
Path "" will now list the entire namespace even if there is only one
Channel with Pattern "INBOX*" (because a hypothetical Pattern "*" would
also include INBOX*, and the queries are kept disjoint to avoid the need
for de-duplication). this isn't expected to be a problem, as listing
mailboxes is generally cheap.
Maildir++ is sufficiently different from the other SubFolder styles to
justify a separate function; the resulting code duplication is minimal,
but the separated functions are a lot clearer.
this also adds code which avoids that the message about excluding the
mailbox is printed multiple times - this could happen with Maildir++, as
the hierarchy is flattened.
amends 0f24ca31b.
we need a separate log entry type which does proper mmaxxuid tracking.
while moving code around, this also removes a redundant debug statement.
amends b1842617.
newmaxuid represents the highest UID for which a sync entry was created,
while maxuid represents the end of the range which is guaranteed to have
been propagated. that means that the former needs to be instantly
incremented (and logged), while the latter must not be touched until the
entire new message sync completes. this matters particularly in the case
of resuming an interrupted run, where sync entry creation must resume
exactly where it left off, while loading the box must use the old limit
to ensure that all messages are available for actual propagation.
we've been using indices to separate master/slave state for a long time,
so there is no point in using pairs of matching brackets to signify the
side in the journal. instead, use somewhat descriptive letters (S[een],
F[ind], T[rashed]) and the index itself.
they are derived from srec->status, which is unsigned. for not
understood reasons, the compiler complains only after extending status
to a full unsigned int.
on the way, localize the declarations.
latest since 77acc268, the code prior to these statements ensures that
the full length is available, so just use memcpy(). the code for
comparing TUIDs uses memcmp() anyway.
introduce recognition of $USE_VALGRIND to run all mbsync invocations
through valgrind.
this also removes the seemingly purposeless --log-fd=3 indirection.