Commit Graph

224 Commits

Author SHA1 Message Date
Oswald Buddenhagen
0c36655201 print actually read TUID in debug message 2016-12-26 16:20:27 +01:00
Oswald Buddenhagen
1330f43034 null-terminate lines read from state file & journal
makes the subsequent code less convoluted.
2016-12-26 16:20:27 +01:00
Oswald Buddenhagen
4db64967c9 make more use of shifted_bit()
technically, this introduces a redundant AND, but the compiler is smart
enough to prove that (((A & M) ^ B) & M) == ((A ^ B) & M).
2016-12-18 22:03:51 +01:00
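The identity quoted in this message can be checked mechanically. The sketch below is illustrative only: `masked_xor_identity_holds()` is a hypothetical helper, not part of isync, and it does not attempt to reproduce `shifted_bit()` itself.

```c
#include <stdint.h>

/* Hypothetical helper (not from isync): returns 1 if
 * (((A & M) ^ B) & M) == ((A ^ B) & M) holds for the given inputs.
 * AND distributes over XOR, so masking A first is redundant. */
static int masked_xor_identity_holds(uint32_t a, uint32_t b, uint32_t m)
{
    return (((a & m) ^ b) & m) == ((a ^ b) & m);
}
```

Because the two sides are equal for all inputs, an optimizing compiler is free to drop the extra AND.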
Oswald Buddenhagen
2bba9b903c wrap message trashing into simple transactions
trashing many messages at once inevitably overtaxes m$ exchange, and the
connection breaks. without any progress tracking, it would restart from
scratch each time, which would lead to a) it never finishing and b) many
copies of the messages in the trash.

full transactions as we do for "proper" syncing would be over the top,
as it's not *that* bad if some messages get duplicated in the trash. so
we record only the messages for which trashing completed, thus allowing
some overlap between the attempts.
2016-11-06 09:26:16 +01:00
Oswald Buddenhagen
5b0c8cfa60 use a temporary for sanity 2016-11-05 18:16:43 +01:00
Oswald Buddenhagen
ae95490d52 pre-sort exception list passed to driver->load_box()
... and use that to optimize the maildir driver somewhat.
2016-11-05 17:32:34 +01:00
Oswald Buddenhagen
7b567164ff abstract growable arrays somewhat
... and sneak in a C99 requirement on the way. just because.
2016-11-05 17:32:34 +01:00
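As a rough illustration of what a growable array abstraction can look like, here is a minimal sketch. The type `int_array_t` and the function `int_array_append()` are assumptions made for this example; the commit's actual interface is not shown in this log.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints (names are illustrative only). */
typedef struct {
    int *data;     /* heap storage, NULL while empty */
    size_t size;   /* number of elements in use */
    size_t alloc;  /* number of elements allocated */
} int_array_t;

/* Append one value, doubling the capacity when full.
 * Returns 0 on success, -1 if the allocation fails. */
static int int_array_append(int_array_t *arr, int value)
{
    if (arr->size == arr->alloc) {
        size_t nalloc = arr->alloc ? arr->alloc * 2 : 8;
        int *ndata = realloc(arr->data, nalloc * sizeof(*ndata));
        if (!ndata)
            return -1;
        arr->data = ndata;
        arr->alloc = nalloc;
    }
    arr->data[arr->size++] = value;
    return 0;
}
```

A zero-initialized `int_array_t` is a valid empty array, and the caller releases the storage with `free(arr.data)`.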
Oswald Buddenhagen
7ddd8d1737 Merge branch 'isync_1_2_branch' 2015-11-08 12:04:44 +01:00
Oswald Buddenhagen
8979ebbdf2 tolerate case changes in X-TUID header name
it is legal for an email system to simply change the case of rfc2822
headers, and at least one imap server apparently does just that.
this would lead to us not finding our own header, which is obviously not
helpful.

REFMAIL: CA+fD2U3hJEszmvwBsXEpTsaWgJ2Dh373mCESM3M0kg3ZwAYjaw@mail.gmail.com
2015-09-01 15:40:54 +02:00
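A case-insensitive match on the header name could look like the sketch below. It assumes POSIX `strncasecmp()`; the helper name `is_tuid_header()` is hypothetical and the real parsing code in isync may differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <strings.h>

/* Hypothetical helper: returns nonzero if the line starts with the
 * "X-TUID:" header name, ignoring case ("x-tuid:", "X-Tuid:", ...). */
static int is_tuid_header(const char *line)
{
    static const char name[] = "X-TUID:";

    return strncasecmp(line, name, sizeof(name) - 1) == 0;
}
```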
Oswald Buddenhagen
549e6739e8 support verbatim and real Maildir++ subfolder naming styles
the legacy style is a poorly executed attempt at Maildir++, so introduce
the latter for the sake of completeness. but most users will probably
just want to use subfolders without any additional dots.
2015-05-01 20:53:23 +02:00
Oswald Buddenhagen
0e1f8f9a3f revamp console output options
- the old meaning of -V[V] was moved to -D{n|N}, as these are really
  debugging options.
- don't print the info messages by default; this can be re-enabled with
  the -V switch, and is implied by most debug options (it was really
  kind of stupid that verbose/debug operation disabled these).
- the sync algo/state debugging can be separately enabled with -Ds now.
2015-03-30 10:31:26 +02:00
Oswald Buddenhagen
8aa22a62e7 make progress counters global
which means they are now cumulative, and include channels and boxes.
2015-03-30 10:30:35 +02:00
Oswald Buddenhagen
a8b26dc4ac soft-limit peak memory usage
propagating many messages from a fast store (typically maildir or a
local IMAP server) to a slow asynchronous store could cause gigabytes of
data to be buffered. avoid this by throttling fetches if the target
context reports memory usage above a configurable limit.

REFMAIL: 9737edb14457c71af4ed156c1be0ae59@mpcjanssen.nl
2015-02-15 18:13:05 +01:00
Oswald Buddenhagen
d9a983add6 add support for propagating folder deletions 2015-01-17 17:51:20 +01:00
Oswald Buddenhagen
926788f3ae supplement open_box() with box existence information from list_store()
there is no point in trying to open a non-existing box before trying to
create it.
2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
7b7304b625 split create_box() off from open_box()
this allows us to do something else than creating missing boxes
depending on circumstances. hypothetically, that is.
2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
f1809ddd2b open the mailboxes after loading the sync state
this allows us to react differently to a box's absence depending on the
state. hypothetically, so far.
2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
f43617cd94 lock sync state lazily
don't try to lock it until we actually read or write it.
the idea is to not fail with SyncState * if we tried to load the state
before selecting a non-existing mailbox. this is ok, because if the
mailbox is missing, we obviously have no sync state pertaining to it,
either.
as a side effect, this allows simplifying an error path.
2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
fb19d644f7 split off open_box() from select_box()
aka prepare_paths() reloaded. we'll need it in a moment.
2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
97a42cd825 factor out {prepare,lock,save,load}_state() 2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
9982e7bf08 make some driver function names more descriptive 2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
00ebf45be2 rename driver::prepare_opts() => prepare_load()
... and move it to the right place in the structure and fix the doc to
not claim that it is called before select().
2015-01-11 15:05:29 +01:00
Oswald Buddenhagen
42cedc8f81 introduce uchar, ushort & uint typedefs 2015-01-11 15:05:28 +01:00
Oswald Buddenhagen
b730f66f7d Merge branch 'isync_1_1_branch' into HEAD
Conflicts:
	src/socket.c
2015-01-11 14:32:15 +01:00
Oswald Buddenhagen
2fa75cf159 fix UID assignment with some non-UIDPLUS servers
the seznam.cz IMAP server seems very eager to send UIDNEXT responses
despite not supporting UIDPLUS. this doesn't appear to be a particularly
sensible combination, but it's valid nonetheless.

however, that means that we need to save the UIDNEXT value before we
start storing messages, lest imap_find_new_msgs() simply overlook
them. we do that outside the driver, in an already present field - this
actually makes the main path more consistent with the journal recovery
path.

analysis by Tomas Tintera <trosos@seznam.cz>.

REFMAIL: 20141220215032.GA10115@kyvadlo.trosos.seznam.cz
2015-01-11 14:29:19 +01:00
Oswald Buddenhagen
958af473a0 fix conditional for early failure in cancel_done() 2015-01-02 12:38:48 +01:00
Oswald Buddenhagen
f377e7b696 introduce FieldDelimiter and InfoDelimiter options
... for windows fs compatibility.

the maildir-specific InfoDelimiter inherits the global FieldDelimiter
(which affects SyncState), based on the assumption that if the sync
state is on a windows FS, the mailboxes certainly will be as well, while
the inverse is not necessarily true (when running on unix, anyway).

REFMAIL: <CA+m_8J1ynqAjHRJagvKt9sb31yz047Q7NH-ODRmHOKyfru8vtA@mail.gmail.com>
2014-10-25 17:42:48 +02:00
Oswald Buddenhagen
85fd5ceb54 move orig_name out of store_t
it's state specific to the synchronizer.
2014-10-25 15:06:50 +02:00
Oswald Buddenhagen
47897d2403 fix memory management of current mailbox name
it was a stupid idea to store the pointer to a variable we need to
dispose in a structure which has its own lifetime.
2014-10-04 18:37:34 +02:00
Oswald Buddenhagen
4f383a8074 stop abusing memcmp()
memcmp() is unfortunately not guaranteed to read forward byte-by-byte,
which means that the clever use as a strncmp() without the pointless
strlen()s is not permitted, and can actually misbehave with
SSE-optimized string functions.

so implement proper equals() and starts_with() functions. as a bonus,
the calls are less cryptic.
2014-10-04 18:37:34 +02:00
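Length-checked comparison helpers in the spirit of this commit might be sketched as follows. The signatures are assumptions for illustration, not copied from isync. The point is that the lengths are compared first, so `memcmp()` is never asked to read past the end of either buffer.

```c
#include <string.h>

/* Sketch only: returns nonzero if str (str_len bytes) begins with
 * prefix (prefix_len bytes). memcmp() runs only when str is known
 * to be at least prefix_len bytes long. */
static int starts_with(const char *str, size_t str_len,
                       const char *prefix, size_t prefix_len)
{
    return str_len >= prefix_len && memcmp(str, prefix, prefix_len) == 0;
}

/* Sketch only: returns nonzero if both buffers have the same length
 * and the same contents. */
static int equals(const char *a, size_t a_len,
                  const char *b, size_t b_len)
{
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```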
Oswald Buddenhagen
526231bc22 initialize store_t::name
the field is marked foreign (for the drivers), so a recycled store may
contain an old pointer in it. that would make our error path crash.

REFMAIL: CAF_KswU7aBS7unnK+rdZy1PG_8SZUAW=tcg75HixDLLE0w3Lhw@mail.gmail.com
2014-07-02 08:50:22 +02:00
Oswald Buddenhagen
29b07ca7a6 actually print the faulty mailbox name, not some garbage
REFMAIL: CAF_KswU7aBS7unnK+rdZy1PG_8SZUAW=tcg75HixDLLE0w3Lhw@mail.gmail.com
2014-07-02 08:49:47 +02:00
Oswald Buddenhagen
2d4bc1e613 error-check committing of sync state
a failure here is rather unlikely, but let's be pedantic.
a failure is not fatal (we'll just enter the journal replay path next
time), so only print warnings.

found by coverity.
2014-04-12 18:31:18 +02:00
Oswald Buddenhagen
aa0118d047 better error messages for sync state and journal related errors
we can make perfectly good use of errno here.
2014-04-12 18:30:09 +02:00
Oswald Buddenhagen
c6ddad6ac4 remove pointless/counterproductive "Disk full?" error message suffixes
the affected functions will set errno to ENOSPC when necessary.
2014-04-12 18:28:21 +02:00
Oswald Buddenhagen
c5f2943ff6 don't crash in message expiration debug print
we would try to print the uids from the non-existing srec of unpaired
messages while preparing expiration.
this would happen only if a) MaxMessages was configured and b) new
messages appeared on the slave but we were not pushing, so it's a bit of
a corner case.

found by coverity.
2014-04-12 15:28:28 +02:00
Oswald Buddenhagen
6d2fd370a6 fix _POSIX_SYNCHRONIZED_IO usage
it can be -1 for unsupported, or 0 for runtime detection (which we don't
do).
2014-01-02 21:09:09 +01:00
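A compile-time check that respects the -1 and 0 values could look like the sketch below. The wrapper name `sync_file_data()` is hypothetical, and falling back to `fsync()` is an assumption of this example rather than something the commit states.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

/* Hypothetical wrapper: use fdatasync() only when the option is
 * positively advertised at compile time. A value of -1 (unsupported)
 * or 0 (needs runtime detection, not done here) selects fsync(). */
static int sync_file_data(int fd)
{
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
    return fdatasync(fd);
#else
    return fsync(fd);
#endif
}
```

Testing only `#ifdef _POSIX_SYNCHRONIZED_IO` would wrongly accept the -1 and 0 cases, which is the kind of mistake this commit describes.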
Oswald Buddenhagen
359091625d MaxMessages: ignore entries with no master while calculating bulk fetch 2013-12-13 15:38:50 +01:00
Oswald Buddenhagen
2bbd07ec87 adjust comments to new reality 2013-12-11 16:29:34 +01:00
Oswald Buddenhagen
5a21042e98 ensure sequencing of message propagation and store closing
by putting the message propagation last, d3f634702 uncovered a
long-standing problem: we might have closed the source store before all
messages were propagated from it.
2013-12-11 16:29:33 +01:00
Oswald Buddenhagen
c47ee1c8c4 fix error paths wrt sync drivers, take 3
msgs_copied() was not checked at all, and msgs_flags_set() was doing it
wrong (sync_close() was not checked).

instead of trying to fix/extend the msgs_flags_set() model (ref-counting
and cancelation checking in lower-level functions, and return values to
propagate the status), place the refs/derefs around higher-level scopes
and do the checking only there. this is effectively simpler, and does
away with some obscure macros.
2013-12-11 16:29:33 +01:00
Oswald Buddenhagen
03b3b566f1 reshuffle sources a bit
split header and move some code to more logical places.
2013-12-08 23:19:12 +01:00
Oswald Buddenhagen
71524cb6b0 reduce FSync option to a boolean
there is no use for Thorough mode any more, so simplify the
configuration.
2013-12-08 11:12:09 +01:00
Oswald Buddenhagen
29a56e2dc4 don't fsync after logging every TUID
as we now don't actually start propagating new messages until all TUIDs
have been generated, it's sufficient to sync just once. this makes it
a cheap operation, so we can do it at SYNC_NORMAL level already.
2013-12-08 11:12:09 +01:00
Oswald Buddenhagen
8d5bd62537 add ExpireUnread option 2013-12-08 11:12:09 +01:00
Oswald Buddenhagen
c0ba0c7ecf replace global_* with a channel_conf_t instance
this makes the (growing) list of getopt_helper()'s parameters
manageable. the few wasted bytes are worth it.
2013-12-08 11:12:09 +01:00
Oswald Buddenhagen
49a32910a7 move handling of new messages after that of old ones
i.e., move it back. whatever the original reason was, it's now gone.

this order is way more natural, which allows us to remove the osrecadd
and S_DONE hacks.
2013-12-01 13:36:28 +01:00
Oswald Buddenhagen
b1842617f7 make MaxMessages work for new mails as well
this helps enormously on the first sync of a 100k message box with a
limit of 1k messages. it also happens to make the syncing idempotent.

in a few conditionals we now explicitly test for max_messages being
enabled, not smaxxuid != 0, as after the initial fetch with no important
messages smaxxuid is zero, but we still have to skip over 99k messages
in the above case.
2013-12-01 13:36:28 +01:00
Oswald Buddenhagen
d3f6347021 delay propagation of new messages
previous sequence:
  examine & propagate new => examine old => propagate old
new sequence:
  examine new => examine old => propagate new => propagate old

this alone does not buy us much ...
2013-12-01 13:36:28 +01:00
Oswald Buddenhagen
391ec01f28 make message propagation recording less magic
assign the sync record to the source message asap, and later on rely
on a more explicit condition than not doing so.
2013-12-01 13:36:28 +01:00
Oswald Buddenhagen
7f784fd235 log maxuid bumping less aggressively
we can bump the internal variable wherever convenient, but we cannot
log it until we know that all messages were copied, as otherwise we
could miss some new messages after an interruption. with the new
approach, interruption would merely cause some additional traffic.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
8b76412b0d document message expiration transactions 2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
ecb4c7ab07 propagate deletions with other flag changes
less code duplication, more logical order of issued driver commands
(especially after the next commit), and the "side effect" of letting the
message expiration code see those deletions if they are asynchronous.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
273ac899f3 don't delay loading master even if messages were expired
the delay optimized the corner case of previously important but now
expired messages on the slave disappearing, either through an external
expunge or after a journal replay. no point in pessimizing the common
case.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
12676f28da remove cleanup of expired entries during setup of master load
the removed code would only ever trigger if a) we were after a journal
replay or b) something external expunged the expired messages - both are
corner cases not worth the extra code.
however, this means that the syncing code further down now needs to take
care of these zombies.
in the end, the normal cleanup will take care of all expired entries,
new and old.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
9a62521cff micro-optimization/-clarification: swap condition order 2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
014d9b9081 make message counting in expiration code less confusing 2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
83b834cdfd count unread messages like flagged messages when expiring
that is, don't count them towards the total only below the cut-off
point. making them extend the working set even though they are inside it
is counterintuitive.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
9e186ae88b use post-sync "seen" flag to determine expirability
otherwise it wouldn't be idempotent.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
15216947fb don't protect recent messages from MaxMessages
while maildir has a clearly defined meaning of "recent" and for example
mutt handles it graciously, IMAP's definition is fubared to the point
that some servers (for example gmail) simply refuse to support it.
for symmetry reasons it is best to pretend that it doesn't exist at all.
it doesn't seem too useful anyway (the user can simply mark the messages
as read to allow pruning).
and last but not least, the man page of mbsync says nothing about
"recent", only "unread". unlike the isync man page, though.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
6b7b2b1106 always get slave flags when we are expiring
even if we are not propagating new messages, the appearance of new
messages on the slave can lead to expiring older messages. for that, we
need to know their importance, and thus flags.

the alternative would be not doing an expiration run when not fetching
new messages, but that would mean more conditionals all over the place.
as the decision is somewhat arbitrary, just do the simpler thing.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
f1eea7d9a5 do not trash expired messages
we are not actually deleting them, so there is no point in saving them
in the trash.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
48754ecc74 make sync state header format less obscure
the header is not space-critical, so use proper name-value pairs.
this has the additional advantage that subsequent format changes can be
done much more easily.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
3dcb393de2 set srec->msg[] when finding messages by tuid
otherwise we would propagate phantom deletions.

this affected only sync runs after an interruption while storing
messages, so it went (mostly?) unnoticed.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
3814f19661 remove pointless assignment
we already know that tmsg->srec is null at this point.
2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
e63e16ab45 assert no stray TUIDs 2013-12-01 13:36:27 +01:00
Oswald Buddenhagen
32def5dc0a add/fix comments and improve debug messages 2013-12-01 13:36:26 +01:00
Oswald Buddenhagen
a9a331c98a simplify condition
... and document the cases.
2013-12-01 13:35:02 +01:00
Oswald Buddenhagen
03f8bfdfb2 micro-optimization/-clarification 2013-12-01 13:35:01 +01:00
Oswald Buddenhagen
00076a6971 move initializations for clarity 2013-12-01 13:35:01 +01:00
Oswald Buddenhagen
61ef099cd5 MaxMessages: make condition exactly symmetrical to condition below 2013-12-01 13:35:01 +01:00
Oswald Buddenhagen
080740f867 rewrite condition for readability and consistency 2013-12-01 13:35:01 +01:00
Oswald Buddenhagen
b10fd0c21c remove assumption about value of M constant 2013-12-01 13:35:01 +01:00
Oswald Buddenhagen
a893cba483 fix enum abuse
amends 9c86ec344.

S_FIND was for the sync record status field. it has no business in the
sync vars status fields. its value coincided with ST_SELECTED, which
luckily only means that we always tried to match up TUIDs even if there
was nothing to do.

the need for TUID matching arises in two mostly independent
circumstances, so add two separate flags ST_FIND_{OLD,NEW}.
2013-12-01 13:35:01 +01:00
Oswald Buddenhagen
0b59ee0df3 support multi-character path separators
this applies to both the IMAP PathDelimiter (which is needed by Lotus
Domino), as well as the Flatten-ed separators.
2013-08-11 10:20:02 +02:00
Oswald Buddenhagen
eb1f10762f added sync support for the arrival date of messages
initial patch by Marc Hoersken <info@marc-hoersken.de>
2013-08-03 18:54:34 +02:00
Oswald Buddenhagen
6577bf3e61 warn if we cannot find some messages by TUID 2013-07-27 20:18:20 +02:00
Oswald Buddenhagen
1847a4e12d make better use of ATTR_UNUSED 2013-07-27 18:44:26 +02:00
Oswald Buddenhagen
5ad83b4e6a don't unnecessarily use continue 2013-07-27 09:34:17 +02:00
Oswald Buddenhagen
e4243debb6 use INT_MAX instead of zero for "no size limit"
this simplifies the actual conditions
2013-07-27 09:34:17 +02:00
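The simplification can be illustrated with a small sketch. Both function names are hypothetical, and the assumption that a configured value of 0 means "no limit" is made only for this example.

```c
#include <limits.h>

/* Hypothetical normalization step: assume 0 (or a negative value)
 * in the configuration means "no size limit" and map it to INT_MAX. */
static int normalize_max_size(int configured)
{
    return configured > 0 ? configured : INT_MAX;
}

/* With INT_MAX as the sentinel, the check needs no special case
 * such as "max_size != 0 && msg_size > max_size". */
static int is_too_big(int msg_size, int max_size)
{
    return msg_size > max_size;
}
```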
Oswald Buddenhagen
ca3a319e60 update copyrights 2013-04-20 16:57:16 +02:00
Oswald Buddenhagen
9261897629 don't record newuid in the sync state
this value is only ever used to find just pushed messages by TUID, so we
can simply use the UIDNEXT value from before we started pushing - and of
course, we need to record that in the journal. it makes no sense to log
the new value after completing a search, as there won't be a next search
before we push the next messages.
2013-03-30 16:46:18 +01:00
Oswald Buddenhagen
96be183acb rename sync_vars_t::uidnext => newuid & fix comment
the purpose of this variable is to hold the UIDNEXT value from before
we started pushing new messages, i.e., the minimal uid we can expect
them to have.
2013-03-30 16:46:18 +01:00
Oswald Buddenhagen
d7eae525bd fix TrashRemoteNew copy direction 2012-09-22 17:35:39 +02:00
Oswald Buddenhagen
35851f133b add option to control amount of fsync()ing 2012-09-15 15:28:15 +02:00
Oswald Buddenhagen
49223b2df2 prevent a system crash from causing messages to be propagated twice
fdatasync() the journal after creating the pair record and recording
the TUID, but before the message propagation actually starts.

all other writes to the journal are not flushed, as they will at worst
cause some unnecessary network traffic without visible effect.
2012-09-15 15:28:15 +02:00
Oswald Buddenhagen
df6c3b64b7 prevent a system crash from clobbering the sync state file
make sure that the new state is committed to disk before overwriting the
old version - by default meta data is committed first, so we may end up
with no valid state at all otherwise.
2012-09-15 13:25:50 +02:00
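The general pattern described here (write the new version, force it to disk, then replace the old file) can be sketched as below. This is a generic illustration under assumed names; `save_state_atomically()` and its parameters are hypothetical and do not reflect the actual isync code.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical sketch: write contents to tmp_path, flush it to stable
 * storage, and only then rename it over path. If the system crashes
 * at any point, path holds either the old or the new state in full.
 * Returns 0 on success, -1 on failure. */
static int save_state_atomically(const char *path, const char *tmp_path,
                                 const char *contents)
{
    FILE *f = fopen(tmp_path, "w");

    if (!f)
        return -1;
    /* fflush() empties the stdio buffer, fsync() commits the data. */
    if (fputs(contents, f) == EOF || fflush(f) != 0 ||
        fsync(fileno(f)) != 0) {
        fclose(f);
        unlink(tmp_path);
        return -1;
    }
    if (fclose(f) != 0) {
        unlink(tmp_path);
        return -1;
    }
    /* rename() replaces the old file in a single step. */
    return rename(tmp_path, path);
}
```

Skipping the `fsync()` would let the rename reach the disk before the data, which is the failure mode this commit message describes.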
Oswald Buddenhagen
f11504aa07 update copyrights
make the wrapper's help string also mention copyrights pertaining only
to the actual syncer, as this is the only string many people will ever
see.
2012-09-01 21:15:53 +02:00
Oswald Buddenhagen
d4c786823d replace FSF address with something more ... contemporary 2012-09-01 21:15:53 +02:00
Oswald Buddenhagen
6d49c343fc use a hash table for message => sync record lookup
this removes the pathological O(<number of sync records> * <number of
new messages>) case at the cost of being a bit more cpu-intensive (but
O(<number of all messages>)) for old messages.
2012-09-01 21:15:53 +02:00
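A lookup table keyed by message UID might be sketched as follows. Everything here is an assumption for illustration: the names `uid_map_t`, `uid_map_init()`, `uid_map_put()` and `uid_map_get()` are hypothetical, the table uses open addressing with linear probing, and UID 0 marks an empty slot (IMAP UIDs are nonzero). The real data structure in isync may look quite different.

```c
#include <stdlib.h>

typedef struct {
    unsigned uid;    /* 0 marks an empty slot */
    int srec_index;  /* index of the matching sync record */
} uid_slot_t;

typedef struct {
    uid_slot_t *slots;
    size_t cap;      /* always a power of two */
} uid_map_t;

/* Size the table for at most `expected` entries (load factor <= 0.5).
 * Returns 0 on success, -1 if the allocation fails. */
static int uid_map_init(uid_map_t *map, size_t expected)
{
    size_t cap = 8;

    while (cap < expected * 2)
        cap *= 2;
    map->slots = calloc(cap, sizeof(*map->slots));
    map->cap = cap;
    return map->slots ? 0 : -1;
}

static size_t uid_map_hash(const uid_map_t *map, unsigned uid)
{
    /* Multiplicative hash, reduced to the table size. */
    return (size_t)(uid * 2654435761u) & (map->cap - 1);
}

/* Insert or update; the caller must not exceed `expected` entries. */
static void uid_map_put(uid_map_t *map, unsigned uid, int srec_index)
{
    size_t i = uid_map_hash(map, uid);

    while (map->slots[i].uid != 0 && map->slots[i].uid != uid)
        i = (i + 1) & (map->cap - 1);
    map->slots[i].uid = uid;
    map->slots[i].srec_index = srec_index;
}

/* Returns the stored index, or -1 if the UID has no sync record. */
static int uid_map_get(const uid_map_t *map, unsigned uid)
{
    size_t i = uid_map_hash(map, uid);

    while (map->slots[i].uid != 0) {
        if (map->slots[i].uid == uid)
            return map->slots[i].srec_index;
        i = (i + 1) & (map->cap - 1);
    }
    return -1;
}
```

Building the table costs one pass over the sync records and each message lookup is then expected constant time, which avoids the quadratic scan the commit message mentions.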
Oswald Buddenhagen
dfd7516b9a introduce ability to flatten the hierarchy of Stores 2012-09-01 21:15:52 +02:00
Oswald Buddenhagen
343f16771a don't crash when select() on master fails synchronously
svars->drv[S] would not be initialized yet, so cancel_sync() would
crash.
2012-09-01 21:15:08 +02:00
Oswald Buddenhagen
28cccf4b35 fix error handling of invalid SyncState *
when we find that the store is incompatible with in-store sync state,
we want to fail the whole channel. however, we must not claim that the
store died, otherwise it won't be disposed of properly.
2012-09-01 21:15:08 +02:00
Oswald Buddenhagen
9c86ec3442 employ alternative scheme to finding messages by TUID
instead of SEARCHing every single message (which is slow and happens to
be unreliable with M$ Exchange 2010), just FETCH the new messages from
the mailbox - the ones we just appended will be amongst them.
2012-09-01 21:15:07 +02:00
Oswald Buddenhagen
b4cef554fc clearer debug msg 2012-09-01 21:15:07 +02:00
Oswald Buddenhagen
7c815538ab fix line wrapping before info messages
unless an info message is explicitly marked as a continuation, it must
terminate any pending line (typically the progress information) first.

debug output is not affected, as it is mutually exclusive with info
output, and no debug lines are left unterminated outside clear scopes.
2012-09-01 21:15:07 +02:00
Oswald Buddenhagen
6b3b6f12bb centralize flushing of unfinished debug lines 2012-09-01 21:15:07 +02:00
Oswald Buddenhagen
d2bed4990d unify error reporting
- introduce sys_error() and use it instead of perror() and
  error(strerror()) in all expected error conditions
- perror() is used only for "something's really wrong with the system"
  kind of errors
- file names, etc. are quoted if they are not validated yet, so e.g. an
  empty string becomes immediately obvious
- improve and unify language
- add missing newlines
2012-09-01 21:15:07 +02:00
Oswald Buddenhagen
584e51ed7d docs
- insert "separator comments" between driver entry points
- document driver API
- document sync_vars_t parts that are stored in the sync state header
2012-09-01 16:03:35 +02:00
Oswald Buddenhagen
e5d323cc47 rely on the maildir's existence with "SyncState *"
now that we open the box first, we know that it will exist at this
point.
2012-09-01 16:03:35 +02:00