[Time] Name | Message |
[07:12] hellophp
|
hi
|
[07:14] hellophp
|
chat
|
[07:14] hellophp
|
kk
|
[07:14] hellophp
|
test
|
[08:13] nettok
|
is there something like "getpeername" for zeromq sockets?
|
[08:13] guido_g
|
no
|
[08:13] nettok
|
ok thanks
|
[11:40] Ogedei
|
so zmq_send seems to clear my msg struct's data -- yet the docs don't talk about it, and some examples in the guide seem to assume it doesn't
|
[11:40] mikko
|
hi
|
[11:41] Ogedei
|
the question then, is: what happens to a msg struct when it is passed to zmq_send?
|
[11:43] sustrik
|
it's cleared
|
[11:43] sustrik
|
which examples in the guide assume it's not?
|
[11:45] Ogedei
|
ah, found it (you cannot send the same message twice). yet, the topic_msg in an example in the guide seems to be reused
|
[11:45] Ogedei
|
hah
|
[11:45] Ogedei
|
ah, that's an example of broken code
|
[11:45] Ogedei
|
a big red flashing warning would be good
|
[11:46] pieterh
|
Ogedei, :-) Just saw that myself
|
[11:47] pieterh
|
It does say: "Note than when you have passed a message to zmq_send(3), ØMQ will clear the message, i.e. set the size to zero. You cannot send the same message twice, and you cannot access the message data after sending it."
|
[11:47] pieterh
|
but somewhat later
|
[11:47] Ogedei
|
so, can I re-initialize my msg struct and send it again, or should I close it and make a new one?
|
[11:47] pieterh
|
whatever is neater for your code
|
[11:47] sustrik
|
there's also zmq_msg_copy function
|
[11:48] Ogedei
|
doing msg_init on a zero-length, but unclosed message won't leak memory, right?
|
[11:48] pieterh
|
that's if you want to send the identical message to multiple sockets or more than once to the same socket
|
[11:48] pieterh
|
Ogedei, it's safe
|
[11:48] Ogedei
|
what if it is not zero-length? will it be magically closed?
|
[11:48] pieterh
|
yes
|
[11:48] Ogedei
|
awesome
|
[11:48] pieterh
|
indeed it is :-)
|
[11:49] sustrik
|
wait a sec, magically closed, what?
|
[11:49] Ogedei
|
whether msg-init (and friends) will close their argument, if necessary
|
[11:49] Ogedei
|
actually
|
[11:49] pieterh
|
sustrik: question was, if you do zmq_msg_init on an existing message will it close the old message or not
|
[11:49] Ogedei
|
how could they do that? it might be raw memory
|
[11:50] sustrik
|
exactly
|
[11:50] sustrik
|
they don't
|
[11:50] pieterh
|
what if there's a free function provided?
|
[11:50] sustrik
|
it's called when message is zmq_close'd
|
[11:50] sustrik
|
zmq_msg_close'd
|
[11:50] pieterh
|
and if you re-init the message it'll leak?
|
[11:50] sustrik
|
yes
|
[11:50] pieterh
|
Ogedei, sorry, I was optimistic
|
[11:50] Ogedei
|
I guess I'll take care to close my messages then
|
[11:50] sustrik
|
as any other C structure
|
[11:51] pieterh
|
sustrik: C structures don't really behave anything like this
|
[11:51] sustrik
|
struct {char *a; char *b};
|
[11:51] pieterh
|
structures don't have constructors
|
[11:51] sustrik
|
when are the strings deallocated?
|
[11:52] sustrik
|
yes, you need initialisation and deinitialisation functions
|
[11:52] sustrik
|
that's what zmq_msg_init and zmq_msg_close are
|
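The init/close discipline discussed above can be sketched in plain C. This is a minimal illustration, not libzmq code: `struct pair`, `copy_string`, `pair_init` and `pair_close` are hypothetical names modelled on sustrik's `struct {char *a; char *b};` example. The point it shows is that an init function cannot safely free the old contents, because it may be handed uninitialised memory, so every init must be matched by a close.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure owning two heap strings, after sustrik's example. */
struct pair {
    char *a;
    char *b;
};

/* Copy a string onto the heap (portable stand-in for strdup). */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Initialise raw memory. The old contents are never inspected, because
   they may be garbage (for example an uninitialised stack variable).
   Calling this on a live, unclosed pair therefore leaks its strings. */
int pair_init(struct pair *p, const char *a, const char *b)
{
    p->a = copy_string(a);
    p->b = copy_string(b);
    if (p->a == NULL || p->b == NULL) {
        free(p->a);
        free(p->b);
        p->a = p->b = NULL;
        return -1;
    }
    return 0;
}

/* Release what pair_init allocated. Resetting the pointers to NULL makes a
   second close harmless in this sketch; that is a choice made here, not a
   claim about how zmq_msg_close behaves. */
void pair_close(struct pair *p)
{
    free(p->a);
    free(p->b);
    p->a = p->b = NULL;
}
```

The safe pattern for reuse is close first, then init again. Whether zmq_msg_close itself tolerates being called twice is not settled in this discussion, so the sketch should not be read as evidence either way.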
[11:52] pieterh
|
any particular reason you _don't_ call close when re-initializing a message?
|
[11:52] pieterh
|
i assume close is idempotent
|
[11:52] sustrik
|
because you have no idea whether you are re-initialising
|
[11:53] pieterh
|
and the msg api does have callbacks for deallocators
|
[11:53] Ogedei
|
you can't just follow the pointers in there -- might be uninitialized raw memory
|
[11:53] pieterh
|
ah, it could just be random data...
|
[11:53] sustrik
|
yes
|
[11:53] Ogedei
|
anyway, did anyone ever get anywhere with a ruby wrapper? I'm writing a wrapper for (the also green-threaded) Allegro Common Lisp
|
[11:53] pieterh
|
because you're not constructing the message, it's just on the stack
|
[11:54] pieterh
|
hmm...
|
[11:54] Ogedei
|
there were some discouraging messages on the list
|
[11:54] sustrik
|
right, there may be garbage inside
|
[11:54] sustrik
|
what's wrong with the ruby wrapper?
|
[11:54] Ogedei
|
there's this issue with blocking APIs and green threads
|
[11:54] pieterh
|
Ogedei, what list ? :-)
|
[11:54] Ogedei
|
lemme search
|
[11:55] sustrik
|
i would guess it's about the fact that ruby, being green-threaded, cannot use blocking calls
|
[11:56] sustrik
|
if so, it's a problem with ruby rather than with 0mq
|
[11:56] sustrik
|
you can still use non-blocking calls
|
[11:56] Ogedei
|
yeah, but then you can say good-bye to responsiveness (that's an exaggeration of course)
|
[11:57] sustrik
|
yes, unfortunately, it's a ruby issue
|
[11:57] sustrik
|
there's little i can do about it
|
[11:58] sustrik
|
green-threads are simply not good for high-perf scenarios
|
[11:58] sustrik
|
anyway, what the python guys are doing, afaiu, is launching several instances of the python interpreter
|
[11:59] sustrik
|
each running exactly one green thread
|
[11:59] Ogedei
|
i have a trick with companion C threads and a pipe that works reasonably well, but yeah, it's awkward
|
[11:59] sustrik
|
which means it can use blocking calls
|
[11:59] sustrik
|
then they use 0mq to send messages between the instances
|
[12:00] sustrik
|
i think no trick would help: either you use non-blocking calls, thus losing performance or you use blocking calls thus eventually blocking other green threads :(
|
[12:05] Ogedei
|
well, no, in my case, i have a thread written in C which does the blocking, and the Lisp runtime is listening on a pipe for events, and thus gets notified when the C thread has done its work. more indirect, but no polling is involved
|
[12:10] sustrik
|
ah, you can do that with 0mq as well
|
[12:10] sustrik
|
you can poll on sockets
|
[12:12] sustrik
|
in a separate C thread
|
[12:12] sustrik
|
however, the problem is how to notify ruby
|
[12:12] sustrik
|
you are back to the same problem
|
[12:12] sustrik
|
ruby can either check for new events, thus losing performance
|
[12:13] sustrik
|
or block waiting for them, thus blocking other green-threads
|
[12:37] CIA-20
|
zeromq2: Martin Sustrik master * rde93f63 / configure.in :
|
[12:37] CIA-20
|
zeromq2: crypto library is needed on HP-UX to generate UUIDs
|
[12:37] CIA-20
|
zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/bUi6DJ
|
[12:37] Guthur
|
Ogedei, It might be good if you extended the current CL binding with some read time conditionals for ACL
|
[12:38] mato
|
sustrik: shouldn't that have gone onto maint also?
|
[12:38] Guthur
|
if possible...
|
[12:38] mato
|
sustrik: (that == HP-UX -lcrypto)
|
[12:40] sustrik
|
mato: it's probably broken anyway
|
[12:40] mato
|
sustrik: why so?
|
[12:41] sustrik
|
what i mean, there are issues compiling it, right?
|
[12:41] sustrik
|
so people are presumably not using it
|
[12:41] sustrik
|
so, let them rather start with 2.1
|
[12:41] mato
|
sustrik: as you wish
|
[12:41] sustrik
|
instead of using 2.0 and then having to upgrade to 2.1 in a short time
|
[12:42] mato
|
sustrik: for the OPEN_MAX thing, just add in a define defining it to _POSIX_OPEN_MAX if it's not there
|
[12:42] sustrik
|
nope, the macros have different semantics
|
[12:42] mato
|
they do?
|
[12:42] sustrik
|
check POSIX
|
[12:42] sustrik
|
{OPEN_MAX}
|
[12:42] sustrik
|
Maximum number of files that one process can have open at any one time.
|
[12:42] sustrik
|
Minimum Acceptable Value: {_POSIX_OPEN_MAX}
|
[12:44] mato
|
sustrik: hmm, well, then it's a bug in HP-UX
|
[12:44] mato
|
sustrik: or they need to find out what the correct value to use is
|
[12:44] sustrik
|
not even that:
|
[12:44] sustrik
|
POSIX says:
|
[12:44] sustrik
|
"A definition of one of the symbolic names in the following list shall be omitted from <limits.h> on specific implementations where the corresponding value is equal to or greater than the stated minimum, but is unspecified.
|
[12:44] sustrik
|
This indetermination might depend on the amount of available memory space on a specific instance of a specific implementation. The actual value supported by a specific instance shall be provided by the sysconf() function."
|
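The POSIX text quoted above suggests asking sysconf() at run time instead of relying on OPEN_MAX being defined. A minimal sketch under that reading follows; `max_open_files` is a hypothetical helper name, and the fallback order is an assumption of this sketch, not what the zeromq2 source does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <unistd.h>

/* Return the per-process limit on open files. Prefer the runtime value from
   sysconf(); fall back to compile-time constants only when sysconf() reports
   no definite limit (it returns -1 in that case). */
long max_open_files(void)
{
    long n = sysconf(_SC_OPEN_MAX);
    if (n > 0)
        return n;
#ifdef OPEN_MAX
    return OPEN_MAX;          /* present only where the limit is fixed */
#else
    return _POSIX_OPEN_MAX;   /* POSIX minimum acceptable value: a lower bound only */
#endif
}
```

Note that _POSIX_OPEN_MAX is only the smallest value a conforming system may have, which is why substituting it for OPEN_MAX changes the meaning, as sustrik points out.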
[12:44] mato
|
sustrik: sure, usual POSIX ambiguity
|
[12:44] sustrik
|
i would do it in following way:
|
[12:44] mato
|
sustrik: what I mean is someone at HP has to say what value should go in there on HP-UX
|
[12:44] sustrik
|
on Solaris use OPEN_MAX-1
|
[12:45] sustrik
|
on other platforms use our constant defined in config.hpp
|
[12:45] sustrik
|
"max_io_events"
|
[12:45] mato
|
sustrik: not a great idea; /dev/poll is platform-specific anyway
|
[12:45] mato
|
sustrik: so that platform-specific value should be determined
|
[12:45] sustrik
|
that's what i've proposed
|
[12:46] DerGuteMoritz
|
is the Ruby green thread blocking discussion from earlier still current?
|
[12:46] sustrik
|
looks like solaris has this limit of "at most OPEN_MAX"
|
[12:46] sustrik
|
hp-ux doesn't seem to have the limit
|
[12:46] sustrik
|
anyway, i'll ask brett to test it
|
[12:47] sustrik
|
DerGuteMoritz: yes, nothing has changed in the meantime :)
|
[12:47] mato
|
ah, right, you're saying that HP-UX has no limit
|
[12:47] mato
|
that's possible
|
[12:48] mato
|
bah, I can't seem to find a copy of the /dev/poll (poll(7d)) manpage for HP-UX on the net anywhere
|
[12:48] mato
|
sustrik: anyhow, check with brett
|
[12:48] mato
|
sustrik: and/or get him to ask their devs
|
[12:51] DerGuteMoritz
|
I don't know whether Ruby provides something like that, but Chicken's green thread scheduler provides a hook to wait for i/o on a file descriptor without blocking other threads. I use it successfully with ZMQ_FD
|
[12:51] DerGuteMoritz
|
works with 2.1 only then, of course
|
[12:52] sustrik
|
DerGuteMoritz: maybe Ruby can do the same thing
|
[12:52] DerGuteMoritz
|
mayhaps!
|
[12:52] sustrik
|
it's up to Ruby binding maintainers though
|
[12:53] mato
|
sustrik: ok, so, event.set() and event.reset() should never return EINTR, right?
|
[12:53] mato
|
sustrik: only event.wait() should do so
|
[12:53] DerGuteMoritz
|
yeah :-)
|
[12:53] sustrik
|
mato: it can be that way
|
[12:53] DerGuteMoritz
|
just thought I'd mention it :-)
|
[12:53] mato
|
sustrik: the other two should just silently restart the call since it *must* succeed
|
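The split mato describes (set and reset restart silently on EINTR, wait reports it) might look like this in C. It is a sketch only: `signal_byte_send` and `signal_byte_wait` are hypothetical names, and a plain file descriptor stands in for whatever the real event_t uses underneath.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write one signal byte, restarting silently if a signal interrupts the
   call. Returns 0 on success, -1 on any error other than EINTR. */
int signal_byte_send(int fd)
{
    const unsigned char byte = 0;
    ssize_t rc;
    do {
        rc = write(fd, &byte, 1);
    } while (rc == -1 && errno == EINTR);
    return rc == 1 ? 0 : -1;
}

/* Blocking wait for one signal byte. Unlike the sender, this does not
   restart: on interruption it returns -1 with errno == EINTR so that the
   caller can decide what to do. */
int signal_byte_wait(int fd)
{
    unsigned char byte;
    ssize_t rc = read(fd, &byte, 1);
    if (rc == 1)
        return 0;
    if (rc == 0)
        errno = EPIPE;   /* peer closed; treated as an error in this sketch */
    return -1;
}
```

A caller can try this with a pipe(): send on the write end, then wait on the read end.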
[12:53] sustrik
|
the whole EINTR thing is a heuristic anyway
|
[12:53] mato
|
ok, i'll go with this approach for now
|
[12:53] sustrik
|
ok
|
[12:53] mato
|
and we'll see what happens
|
[13:07] mikko
|
mato: http://webcache.googleusercontent.com/search?q=cache:XwnPSWMLfJkJ:docs.hp.com/en/B3921-90010/poll.7.html+hp-ux+dev/poll&cd=1&hl=en&ct=clnk&client=firefox-a
|
[13:07] mikko
|
is that the one?
|
[13:10] mikko
|
ah
|
[13:10] mikko
|
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02263385/c02263385.pdf
|
[13:10] mikko
|
poll(7) as pdf
|
[13:11] mato
|
mikko: hey, thanks! yeah, that's it
|
[13:12] mikko
|
mato: http://h20000.www2.hp.com/bizsupport/TechSupport/CoreRedirect.jsp?redirectReason=DocIndexPDF&prodSeriesId=4256918&targetPage=http%3A%2F%2Fbizsupport2.austin.hp.com%2Fbc%2Fdocs%2Fsupport%2FSupportManual%2Fc02456334%2Fc02456334.pdf
|
[13:12] mikko
|
"HP-UX Reference - Clickable Manpage Index for HP-UX 11i v3 (September 2010 Update)"
|
[13:13] mato
|
sustrik: ok, it would seem that the HP-UX /dev/poll doesn't mention any explicit limit
|
[13:13] mato
|
sustrik: over and above that of EMFILE/ENFILE obviously
|
[13:21] sustrik
|
mato, mikko: ack
|
[13:29] CIA-20
|
zeromq2: Martin Sustrik master * rd4a4106 / src/devpoll.cpp :
|
[13:29] CIA-20
|
zeromq2: HP-UX has no OPEN_MAX defined
|
[13:29] CIA-20
|
zeromq2: devpoll_t used this constant to determine how many events to
|
[13:29] CIA-20
|
zeromq2: retrieve from the poller in one go. The implementation was
|
[13:29] CIA-20
|
zeromq2: changed not to depend on this constant.
|
[13:29] CIA-20
|
zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/bYvpwB
|
[13:50] mato
|
sustrik: ok, so i found that problem, just a simple mistake in my code
|
[13:50] mato
|
sustrik: one code path was reading twice...
|
[13:50] mato
|
sustrik: now i have a different problem, i think
|
[13:50] mato
|
sustrik: which has to do with the event semantics
|
[13:51] mato
|
sustrik: what i see now is a hang on context termination...
|
[13:54] sustrik
|
re
|
[13:54] sustrik
|
what specifically?
|
[13:57] mato
|
sustrik: i think what i'm seeing is that the i/o threads are not getting the event signaled when it should be
|
[13:57] mato
|
sustrik: ... this is not clear ...
|
[13:58] mato
|
sustrik: if the event is signaled only when ypipe_t flush() returns false, won't that result in missed events?
|
[13:58] mato
|
sustrik: it's not clear to me how ypipe_t "knows" that the reader is asleep/polling
|
[14:02] sustrik
|
when reader tries to get a command
|
[14:02] sustrik
|
and there is no command available
|
[14:02] sustrik
|
an atomic variable is set to null
|
[14:02] sustrik
|
when writer writes a command and finds out that the atomic variable is set to null
|
[14:03] sustrik
|
it notifies the writer by returning false
|
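The handshake sustrik describes can be modelled with a single atomic variable. The sketch below is a deliberately simplified stand-in, not the real ypipe_t: it only counts commands instead of storing them, and `struct mailbox`, `mailbox_read` and `mailbox_flush` are hypothetical names. It assumes one reader and any number of writers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define READER_ASLEEP (-1)

/* Hypothetical mailbox state: number of flushed but unread commands, or
   READER_ASLEEP once the reader has found the pipe empty. */
struct mailbox {
    atomic_int state;
};

void mailbox_init(struct mailbox *m)
{
    atomic_init(&m->state, 0);
}

/* Reader side: take one command if available. On an empty pipe, atomically
   mark the reader as asleep and return false. */
bool mailbox_read(struct mailbox *m)
{
    int s = atomic_load(&m->state);
    for (;;) {
        if (s > 0) {
            if (atomic_compare_exchange_weak(&m->state, &s, s - 1))
                return true;
        } else if (s == 0) {
            if (atomic_compare_exchange_weak(&m->state, &s, READER_ASLEEP))
                return false;
        } else {
            return false;   /* already marked asleep */
        }
        /* A failed exchange reloaded s; loop and re-evaluate. */
    }
}

/* Writer side: publish one command. Returns false if the reader was asleep,
   in which case the caller must wake it (for example via the signaler). */
bool mailbox_flush(struct mailbox *m)
{
    int s = atomic_load(&m->state);
    for (;;) {
        if (s == READER_ASLEEP) {
            if (atomic_compare_exchange_weak(&m->state, &s, 1))
                return false;
        } else if (atomic_compare_exchange_weak(&m->state, &s, s + 1)) {
            return true;
        }
    }
}
```

In this model the writer is told to signal only after the reader has drained everything and seen an empty pipe. A reader that stops before reading all commands never marks itself asleep, so later writes would not trigger a wake-up.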
[14:08] mato
|
sustrik: hmm, might the problem be because there is a case where the caller of the signaller does not process *all* commands?
|
[14:09] mato
|
sustrik: my trivial debugging seems to show that the signaler is written to, but it doesn't set the event, presumably because the writer thinks the reader is still alive
|
[14:09] mato
|
sustrik: which could happen if the reader does not read *all* commands
|
[14:09] mato
|
sustrik: correct?
|
[14:13] sustrik
|
yes
|
[14:13] mato
|
but i'm probably missing something
|
[14:28] mato
|
sustrik: ok, event.set() in the signaler constructor is a neat trick
|
[14:28] mato
|
sustrik: things almost work
|
[14:28] sustrik
|
almost?
|
[14:28] mato
|
test_shutdown_stress fails rather interestingly
|
[14:29] mato
|
due to ~mutex_t() from ~signaler_t() trying to destroy a mutex that is locked... :-)
|
[14:29] mato
|
so someone is still trying to send to that signaler...
|
[14:29] mato
|
this might be related to that other problem reported on the ML
|
[14:30] sustrik
|
interesting
|
[14:30] sustrik
|
can you push to the github
|
[14:31] sustrik
|
so that i can check that
|
[14:31] sustrik
|
?
|
[14:31] mato
|
guess so, d'you want it with my debug code which prints various bits about what the signaler is doing?
|
[14:32] mato
|
sustrik: ?
|
[14:33] sustrik
|
probably not
|
[14:33] mato
|
hmm, ok give me a moment to stash it away
|
[14:33] sustrik
|
with shutdown stress that would be a lot of printfs
|
[14:33] mato
|
doesn't really matter, the offending sequence aborts anyway
|
[14:33] mato
|
but ok, i'll take it out
|
[14:42] CIA-20
|
zeromq2: Martin Lucina wip-signaler * rabf6d73 / (4 files):
|
[14:42] CIA-20
|
zeromq2: Move signaled into event_t as atomic counter
|
[14:42] CIA-20
|
zeromq2: Moved signaled from signaler_t into event_t and made event_t::set() and
|
[14:42] CIA-20
|
zeromq2: event_t::reset() methods idempotent.
|
[14:42] CIA-20
|
zeromq2: Made event_t::wait() handle EINTR, except for eventfd implementation (for now).
|
[14:42] CIA-20
|
zeromq2: Signed-off-by: Martin Lucina <mato@kotelna.sk> - http://bit.ly/bNpvjo
|
[14:42] CIA-20
|
zeromq2: Martin Lucina wip-signaler * r3356bd2 / (4 files):
|
[14:42] CIA-20
|
zeromq2: WIP: Make signaler use ypipe_t for queueing
|
[14:42] CIA-20
|
zeromq2: Signed-off-by: Martin Lucina <mato@kotelna.sk> - http://bit.ly/8ZMJLw
|
[14:42] mato
|
sustrik: ok, try it out
|
[14:42] sustrik
|
thx
|
[14:42] mato
|
sustrik: there are actually two problems
|
[14:43] mato
|
sustrik: you will get asserts from the other tests on queue.read(), so it looks like i'm losing events somewhere, will check that
|
[14:43] mato
|
sustrik: suggest you just look at what test_shutdown_stress is doing
|
[14:43] mato
|
sustrik: oh, and i've not tested the eventfd implementation, so it might not work at all (it definitely does not process EINTR)
|
[14:43] mato
|
i'll look at the problem with events getting lost
|
[14:45] sustrik
|
ok
|
[14:51] mato
|
sustrik: ok, my fault with the event problem; atomic_counter is not ideal since I actually need atomic *set*
|
[14:52] mato
|
sustrik: sorry, i mean with the problem with losing events
|
[14:52] sustrik
|
ah
|
[14:52] mato
|
i'm using add and sub, i thought they can't happen multiple times, but it seems they can
|
[14:52] mato
|
so i really need set
|
[14:53] sustrik
|
there's some old code using lock;xchg
|
[14:53] sustrik
|
let me find it
|
[14:53] mato
|
sustrik: i'll make it work, don't worry about it
|
[14:54] sustrik
|
ok
|
[14:54] mato
|
sustrik: i can just make the set() method of atomic_counter work
|
[14:54] mato
|
sustrik: work atomically that is
|
[15:09] travlr
|
sustrik: hey martin, just noticed your message, something i can help you with?
|
[15:10] sustrik
|
i've noticed that your online documentation is not linked from the website
|
[15:10] sustrik
|
or is it?
|
[15:10] travlr
|
i think its on the "source" page at the end of the intro paragraph
|
[15:11] sustrik
|
ah, ok, i see
|
[15:11] sustrik
|
anyway, i'm writing an architecture overview, so i'll link it from there as well
|
[15:12] travlr
|
cool, if you want to eventually convert the sources to using doxygen style comments let me know.
|
[15:13] mato
|
sustrik: um, the eventfd code is completely bogus, sorry
|
[15:13] mato
|
sustrik: i'll fix it later, am trying to figure out what i'm doing wrong right now
|
[15:14] sustrik
|
ok
|
[16:10] sustrik
|
mato: still there?
|
[16:11] sustrik
|
i think i've found the problem with shutdown stress test
|
[16:11] mato
|
sustrik: yes, i'm fighting with the event stuff
|
[16:11] mato
|
sustrik: it's complete black magic, i don't understand what i'm doing wrong with the synchronization
|
[16:12] sustrik
|
shutdown stress test ->
|
[16:12] sustrik
|
the sender sends a command
|
[16:12] sustrik
|
before it gets chance to unlock the mutex
|
[16:13] sustrik
|
the receiving thread reads the command
|
[16:13] sustrik
|
processing the command causes destruction of the object
|
[16:13] sustrik
|
and here we are
|
[16:13] sustrik
|
EBUSY
|
[16:13] mato
|
right, that'd make sense...
|
[16:14] sustrik
|
i've added sync.lock(); sync.unlock(); into the destructor of signaler_t
|
[16:14] sustrik
|
so that it waits till the mutex is released
|
[16:14] sustrik
|
and it seems to work now
|
[16:14] mato
|
what if a writer locks it again in the mean time?
|
[16:14] mato
|
i.e. after the sync.unlock() in the destructor, but before the actual destruction?
|
[16:17] sustrik
|
there should be no more commands for an object after it shuts down
|
[16:17] sustrik
|
that's why it counts term_acks
|
[16:18] mato
|
ok
|
[16:18] sustrik
|
will you add the code?
|
[16:18] mato
|
yes
|
[16:18] sustrik
|
ok
|
[16:18] mato
|
i have a bigger problem
|
[16:18] mato
|
which is that there's something wrong with how i'm synchronizing the signaled variable
|
[16:18] mato
|
and i don't understand what it is
|
[16:19] mato
|
i've changed the code to use CAS
|
[16:19] mato
|
but i still get stuff like coming out of wait() it fails with an assertion that signaled is zero
|
[16:19] mato
|
where it should be one
|
[16:19] sustrik
|
a multi-core box?
|
[16:19] mato
|
yes
|
[16:20] sustrik
|
orgering of CAS and send/recv is ok?
|
[16:20] sustrik
|
ordering*
|
[16:20] mato
|
CAS is done first
|
[16:20] sustrik
|
in both cases
|
[16:20] sustrik
|
?
|
[16:20] mato
|
yes
|
[16:20] sustrik
|
wait a sec
|
[16:21] mato
|
hmm, i just realised i had extra left-over code in there
|
[16:21] mato
|
but it still doesn't work
|
[16:22] mato
|
sustrik: in event.set() i have
|
[16:22] mato
|
if (signaled.cas (0, 1) == 1)
|
[16:22] mato
|
return;
|
[16:22] mato
|
and then send
|
[16:22] mato
|
after the send i assert that it's still 1
|
[16:23] mato
|
which fails, presumably because the reader has since re-set it
|
[16:23] sustrik
|
that's ok, no?
|
[16:23] mato
|
that's fine
|
[16:23] mato
|
in fact, i just removed that assert
|
[16:24] mato
|
but the weird one is
|
[16:24] mato
|
after wait() i get signaled = 0
|
[16:24] sustrik
|
what does the wait code look like?
|
[16:24] mato
|
assert signaled == 0
|
[16:24] mato
|
recv
|
[16:24] mato
|
return -1 if EINTR
|
[16:24] mato
|
assert signaled == 1
|
[16:24] mato
|
that's all
|
[16:25] mato
|
it's obviously a synchronization problem since it doesn't always happen at the same "place" e.g. when running test_reqrep_tcp
|
[16:26] sustrik
|
the first assert is bogus
|
[16:26] mato
|
yes, i took it out
|
[16:26] sustrik
|
the second one fails?
|
[16:26] mato
|
sorry, which first assert
|
[16:26] sustrik
|
assert signaled == 0
|
[16:26] mato
|
you mean signaled == 0 at start of wait?
|
[16:26] sustrik
|
yes
|
[16:26] mato
|
ah, right, i can take that out
|
[16:27] mato
|
but it's the 2nd one that's failing
|
[16:28] sustrik
|
how do you test the signaled variable?
|
[16:28] sustrik
|
exactly?
|
[16:28] mato
|
just a normal get()
|
[16:28] mato
|
shouldn't matter on x86
|
[16:28] sustrik
|
right
|
[16:31] sustrik
|
it looks like there are some leftover bytes in the socketpair
|
[16:31] mato
|
why would that be?
|
[16:31] sustrik
|
i cannot think of any other reason why this would happen
|
[16:32] sustrik
|
dunno
|
[16:32] mato
|
yes, but why would signaled=0 indicate leftover bytes in the socketpair?
|
[16:32] sustrik
|
can you push the code, so that i can have a look at it?
|
[16:33] sustrik
|
because wait will return even though the sender hasn't sent anything
|
[16:33] sustrik
|
and thus it hasn't set the flag to 1
|
[16:34] sustrik
|
so, imo, it looks like the sender is sending signal to receiver twice
|
[16:34] sustrik
|
even though receiver was stuck only once
|
[16:34] mato
|
sustrik: sure, but the cas() in event.set() should protect it from running more than once...
|
[16:35] sustrik
|
it's hard to reason about without seeing the code
|
[16:36] sustrik
|
i think i know what the problem is
|
[16:36] mato
|
?
|
[16:36] sustrik
|
the reader doesn't do read&reset as an atomic unit locked by mutex
|
[16:36] sustrik
|
so the cas on the sender side can happen between the two
|
[16:37] sustrik
|
maybe
|
[16:37] sustrik
|
i need to see the code
|
[16:37] mato
|
i'll email it to you, i don't want to pollute the main git with WIP stuff like this
|
[16:37] sustrik
|
ok, just send it
|
[16:39] mato
|
sent
|
[16:39] sustrik
|
thx
|
[16:47] jdroid-
|
this might be naive... but what is the zeromq response to someone who says, "I need a broker?"
|
[16:48] sustrik
|
use a device
|
[16:48] sustrik
|
mato: i think what you need to do is this:
|
[16:48] jdroid-
|
mind elaborating? does zmq recommend certain devices? how do they work?
|
[16:49] sustrik
|
move the CAS in reset *after* the recv
|
[16:50] mikko
|
jdroid-: http://zguide.zeromq.org/chapter:all#toc29
|
[16:50] sustrik
|
jdroid-: there are several devices shipped with 0mq itself (queue, forwarder, streamer)
|
[16:50] sustrik
|
but you can build new ones by hand, it's easy
|
[16:50] jdroid-
|
ohh.. weird. i thought you meant hardware
|
[16:51] mato
|
sustrik: hmm, ok, but at the same time reset() must be idempotent
|
[16:51] sustrik
|
does it?
|
[16:51] mato
|
sustrik: so we should not do the recv() if signaled is 1...
|
[16:51] jdroid-
|
sometimes people suggest that zeromq isn't actually a queue. do these devices make that so then?
|
[16:52] sustrik
|
mato: you mean 0?
|
[16:52] mato
|
sustrik: yes
|
[16:52] sustrik
|
that should be guaranteed by semantics of ypipe i would say
|
[16:53] sustrik
|
jdroid-: it's not a queue, it's a toolkit to build queueing systems
|
[16:53] mato
|
sustrik: no, that doesn't help
|
[16:54] mato
|
sustrik: the problem is more complex than that it seems
|
[16:54] jdroid-
|
sustrik: i see. are there any projects that could help me understand what's involved with building a queue in zmq?
|
[16:54] sustrik
|
mato: what have you done?
|
[16:54] mato
|
sustrik: moved the CAS
|
[16:54] mato
|
sustrik: still fails on the assert coming out of wait()
|
[16:55] sustrik
|
jdroid: read the guide, that's the best way to understand how the whole thing works
|
[16:55] sustrik
|
mato: same assert?
|
[16:55] mato
|
sustrik: yes
|
[16:55] jdroid-
|
sustrik: fair.
|
[16:58] sustrik
|
mato: bleh, we'll need to use the mutex on recv side as well
|
[16:58] sustrik
|
there's no other way to make the socketpair and the flag behave as an atomic object
|
[16:59] sustrik
|
so, try this:
|
[16:59] sustrik
|
forget about atomic ops
|
[16:59] sustrik
|
simply make signaled a bool
|
[16:59] sustrik
|
and set/reset it from inside of the critical section
|
[16:59] sustrik
|
the same one you use to send/recv on the socketpair
|
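sustrik's suggestion (a plain bool, changed only inside the same critical section that does the send/recv) could be sketched as follows. This is an illustration under assumptions, not the zeromq2 implementation: `struct event` and the `event_*` functions are hypothetical C stand-ins for the C++ event_t, and the wait side is left out. The lock/unlock pair in `event_close` mirrors the destructor workaround mentioned earlier in the log.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical event object: a socketpair plus a flag, both guarded by the
   same mutex so that they always change together. */
struct event {
    pthread_mutex_t sync;
    int w;           /* write end of the socketpair */
    int r;           /* read end; a waiter would poll or recv on this */
    bool signaled;   /* true while exactly one byte is in flight */
};

int event_init(struct event *e)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (pthread_mutex_init(&e->sync, NULL) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    e->w = sv[0];
    e->r = sv[1];
    e->signaled = false;
    return 0;
}

/* Idempotent: a second set() while already signaled sends nothing. */
int event_set(struct event *e)
{
    int rc = 0;
    pthread_mutex_lock(&e->sync);
    if (!e->signaled) {
        const unsigned char byte = 0;
        ssize_t n;
        do {
            n = send(e->w, &byte, 1, 0);
        } while (n == -1 && errno == EINTR);
        if (n == 1)
            e->signaled = true;
        else
            rc = -1;
    }
    pthread_mutex_unlock(&e->sync);
    return rc;
}

/* Idempotent: drains the byte only if one was sent, so recv cannot block. */
int event_reset(struct event *e)
{
    int rc = 0;
    pthread_mutex_lock(&e->sync);
    if (e->signaled) {
        unsigned char byte;
        ssize_t n;
        do {
            n = recv(e->r, &byte, 1, 0);
        } while (n == -1 && errno == EINTR);
        if (n == 1)
            e->signaled = false;
        else
            rc = -1;
    }
    pthread_mutex_unlock(&e->sync);
    return rc;
}

void event_close(struct event *e)
{
    /* Wait for a sender that may still hold the mutex before destroying it. */
    pthread_mutex_lock(&e->sync);
    pthread_mutex_unlock(&e->sync);
    pthread_mutex_destroy(&e->sync);
    close(e->w);
    close(e->r);
}
```

Because the flag and the byte in the socketpair are updated under one lock, the flag is true exactly when one byte is pending, which is the invariant the CAS-based attempts could not maintain.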
[18:27] idefine
|
is there documentation for scaling to multiple machines?
|
[18:35] cremes
|
idefine: not sure what you are asking for; have you looked at the "tcp" and "pgm" transports?
|
[18:42] stephank
|
Bindings related question, because a couple of us node.js folks are working on this. We're not really following the binding guidelines, because node.js' socket API isn't traditional either. Multi-part messages were just implemented as `sock.send(part1, part2);` and I was wondering if we were missing important use-cases.
|
[18:42] stephank
|
For example, would it make sense to do any significant (possibly async) work between submitting message parts, or the like?
|
[18:43] stephank
|
Or is multi-part's intention really only to submit logically separate parts to a REQ, like say HTTP headers and body separate?
|
[18:52] cremes
|
stephank: i think there are probably a few cases where submitting message parts async would be useful
|
[18:52] cremes
|
imagine a scenario where each part has different computational requirements, for example
|
[18:53] cremes
|
that being said, i don't think the message is transmitted until all parts are "sent" via the api
|
[18:53] pieterh
|
stephank, multi-part's intention is to provide an easy way to subframe a message
|
[18:53] cremes
|
so from that perspective, you aren't saving any time
|
[18:53] pieterh
|
e.g. to create routing envelopes
|
[18:53] pieterh
|
(see chapter 3 of the Guide)
|
[18:53] pieterh
|
and to allow zero-copy on subframes, independently
|
[18:54] pieterh
|
i.e. write a message envelope from one buffer and a message body from another, without copy
|
[18:54] pieterh
|
there are no use cases I've seen where apps do real work in between sending or receiving message parts
|
[18:55] pieterh
|
you cannot sensibly do any logic (loops, conditional) between parts of a single message
|
[18:56] pieterh
|
maybe in some cases, e.g. using the contents of a header to know how many parts to read
|
[18:56] pieterh
|
hth
|
[18:56] stephank
|
Good points. So, would it make sense to then say, leave the queueing to the user?
|
[18:57] pieterh
|
queuing of message parts?
|
[18:57] pieterh
|
the best model I've found is (a) build message from parts (b) send it
|
[18:57] pieterh
|
and (c) read all message parts into one structure
|
[18:57] stephank
|
That's essentially what ØMQ does while you are submitting, right?
|
[18:57] pieterh
|
sure
|
[18:58] pieterh
|
you cannot interleave message parts from different messages
|
[18:58] pieterh
|
I'd suggest either keeping the semantics of 'send part + flag to indicate final'
|
[18:59] pieterh
|
or else providing a two step 'construct multipart message' and 'send message' semantic
|
[18:59] pieterh
|
actually sock.send (part1, part2, part3) looks pretty decent
|
[19:00] pieterh
|
but it's not orthogonal with sock.recv
|
[19:00] stephank
|
I think I might be missing something. The only thing I can think of that's missing in an API that demands all parts in a single function call is that there's no access to ØMQ's queueing that's happening in the background.
|
[19:00] pieterh
|
'all parts
|
[19:00] pieterh
|
'all parts' is an open ended list?
|
[19:00] stephank
|
open ended?
|
[19:01] pieterh
|
sorry, I'm not familiar with JS syntax here
|
[19:01] pieterh
|
can you specify a variable number of arguments?
|
[19:02] stephank
|
Yes, javascript is actually very loose when it comes to function arguments. Node.js is an entirely async environment, and the way it is implemented now is to send multi-part messages using "send(part1, part2, ...);" and receive using a handler that might look like "function(part1, part2, ...) { ... }"
|
[19:03] pieterh
|
so send will work fine
|
[19:03] pieterh
|
but recv will be difficult
|
[19:03] stephank
|
But in JS, a function could potentially be defined as "function() { ... }", but still access its arguments using a special "arguments" array. So even if the receiver does not know the number of parts ahead, it can still inspect them.
|
[19:03] pieterh
|
since you do not know in advance how many parts you will get
|
[19:03] pieterh
|
you really need to receive into a single structure
|
[19:03] pieterh
|
an array of parts, for example
|
[19:04] pieterh
|
and then you'd naturally send from the same structure
|
[19:04] pieterh
|
so that you recv and send the same data types
|
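pieterh's advice (receive all parts into one structure, since the count is unknown in advance) can be sketched without a socket. Everything here is hypothetical: `struct frame` stands in for one received part together with its "more parts follow" flag (in libzmq 2.x that flag is read with the ZMQ_RCVMORE socket option), and `message_collect` stands in for a binding's receive loop.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTS 16

/* Hypothetical frame as delivered to a binding: payload plus a flag saying
   whether more parts of the same message follow. */
struct frame {
    const char *data;
    int more;
};

/* Hypothetical whole message: an array of parts. */
struct message {
    const char *parts[MAX_PARTS];
    size_t count;
};

/* Collect frames from `stream` (standing in for successive socket receives)
   until one arrives with more == 0. Returns the number of frames consumed,
   or 0 if the message is incomplete or has more than MAX_PARTS parts. */
size_t message_collect(struct message *msg, const struct frame *stream,
                       size_t available)
{
    size_t i;
    msg->count = 0;
    for (i = 0; i < available; i++) {
        if (msg->count == MAX_PARTS)
            break;
        msg->parts[msg->count++] = stream[i].data;
        if (!stream[i].more)
            return i + 1;   /* final part seen: message is complete */
    }
    msg->count = 0;         /* never hand out a partial message */
    return 0;
}
```

Only a complete array is ever passed on, which matches the node.js handler behaviour stephank describes, and the same array type can then be used on the send side.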
[19:04] stephank
|
Yes, that's what's happening in the current implementation. It simply builds a javascript array as it's receiving parts, and only calls the handler at the very end.
|
[19:05] pieterh
|
what's the 'handler' and is it 'receiving' parts from the caller or the socket?
|
[19:05] pieterh
|
unclear, sorry
|
[19:06] stephank
|
No problem. :) The handler is the function that will be called from the event loop on the receiving end. Receiving parts was meant as receiving from a socket.
|
[19:06] pieterh
|
ack
|
[19:06] pieterh
|
so it passes an array to the handler
|
[19:06] stephank
|
exactly. :)
|
[19:06] pieterh
|
and the handler can send the same array back out to another socket
|
[19:07] stephank
|
It can, yes
|
[19:07] pieterh
|
try, for fun, implementing some of the examples in the guide,
|
[19:07] pieterh
|
that will help you understand the semantics
|
[19:07] stephank
|
That's a good plan. :)
|
[19:07] stephank
|
We actually lack a decent test suite atm. This is all early work.
|
[19:08] stephank
|
(and it depends on unstable versions of both ØMQ and node.js)
|
[19:08] pieterh
|
the examples form a decent test of the API syntax
|
[19:08] pieterh
|
especially when you get to more sophisticated apps from ch2 and ch3
|
[19:08] pieterh
|
even in C they need abstractions like the zmsg class
|
[19:09] Guthur
|
are bindings generally avoiding providing high level abstractions of the C API?
|
[19:09] stephank
|
I'll have a closer look soon. Writing these bindings was actually my way of learning the API.
|
[19:10] pieterh
|
Guthur, that's up to binding authors... I suspect some of the high level abstractions will get reused
|
[19:10] stephank
|
pieterh: Thanks for thinking along. Much appreciated. :)
|
[19:10] Guthur
|
ok, so there is some leeway for binding authors
|
[19:10] stephank
|
and cremes ;)
|
[19:10] pieterh
|
Guthur, yes, and they already do that, especially for send/recv which are very low level in C
|
[19:11] pieterh
|
stephank, np
|
[19:13] Guthur
|
pieterh, Ok, it's just that I have adopted the C# binding and would like to provide a few higher level abstractions
|
[19:27] pieterh
|
Guthur, I think it's worth trying to make abstractions other people can/will reuse
|
[19:27] pieterh
|
which means writing some analysis upfront and publishing that
|
[19:28] pieterh
|
for example I'm interested in building up the abstraction layer in C, being ZFL
|
[19:59] Guthur
|
pieterh, to be honest I was thinking mostly of some accessors for sockopts
|
[19:59] Guthur
|
After that I will work through the examples more and see if anything else comes to me
|
[20:17] pieterh
|
Guthur, sounds good
|
[20:17] pieterh
|
I think the Java binding also does that
|
[20:55] cremes
|
stephank: if you're a node.js guy, you might want to check out zmqmachine (github) which is my 0mq Ruby reactor
|
[20:55] cremes
|
stephank: i only mention it because i read long ago that node.js was influenced by eventmachine
|
[21:38] idefine
|
j #jira
|
[21:42] stephank
|
cremes: Oh, wouldn't know, but sounds possible.
|
[21:42] stephank
|
cremes: I'm looking at Req#send_messages, and I believe we're on the same page there. :)
|
[21:45] stephank
|
I'm wondering if it's worthwhile to do, but it'd definitely be neater if we had a direct binding, and implemented all the js and node-js magic on top of that in js itself.
|
[21:45] stephank
|
Right now, a lot of that magic is in C++, which may be cause of some mysterious stability issues
|
[21:47] stephank
|
Probably is worthwhile :)
|
[21:47] jhawk28
|
hello all
|
[21:47] stephank
|
hi
|
[21:56] Guthur
|
are there any clrzmq users around?
|
[22:13] jhawk28
|
Guthur: sorry - mostly Java for me
|
[22:22] Guthur
|
maybe the zmq mailing list to see if any users fall out of the wood work
|
[22:23] Guthur
|
I'd like to make a raft of changes to the binding but would rather some feedback from users before committing them
|
[22:23] Guthur
|
They are not backwards compatible which makes it more difficult
|