[Time] Name | Message |
[03:20] stoneman
|
hello
|
[03:20] stoneman
|
is anyone here?
|
[03:27] dbudworth
|
Trying to figure out if 0mq is appropriate for my project. A simple description would be something like a sticky load balancer: once a client talks to a service, it sticks with that one, for a bi-directional conversation between 2 nodes once an initial node has been selected. I.e. my client does round robin on each new conversation, but updates to a given conversation have to stick to the server that started it.
|
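The routing rule dbudworth describes (round robin for a new conversation, then stick to the chosen server) can be sketched independently of 0MQ. This is a minimal illustration only: `sticky_router`, `router_init` and `router_pick` are hypothetical names, the fixed-size table is an assumption made to keep the sketch short, and nothing here is a 0MQ API.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CONVERSATIONS 64
#define ID_LEN 32

/* Hypothetical routing table: conversation id -> index of the chosen server. */
struct sticky_router {
    size_t n_servers;
    size_t next;                          /* next server for round robin */
    size_t n_conversations;
    char ids[MAX_CONVERSATIONS][ID_LEN];
    size_t server_of[MAX_CONVERSATIONS];
};

void router_init(struct sticky_router *r, size_t n_servers)
{
    memset(r, 0, sizeof *r);
    r->n_servers = n_servers;
}

/* Returns the server index for this conversation, or -1 if there are no
   servers, the id is too long, or the table is full. */
int router_pick(struct sticky_router *r, const char *conversation_id)
{
    size_t i, chosen;

    if (r->n_servers == 0 || strlen(conversation_id) >= ID_LEN)
        return -1;

    /* Known conversation: stick to the server picked the first time. */
    for (i = 0; i < r->n_conversations; i++)
        if (strcmp(r->ids[i], conversation_id) == 0)
            return (int)r->server_of[i];

    if (r->n_conversations == MAX_CONVERSATIONS)
        return -1;

    /* New conversation: take the next server in round-robin order. */
    chosen = r->next;
    r->next = (r->next + 1) % r->n_servers;
    strcpy(r->ids[r->n_conversations], conversation_id);
    r->server_of[r->n_conversations] = chosen;
    r->n_conversations++;
    return (int)chosen;
}
```

With three servers, conversations "a", "b", "c" would land on servers 0, 1, 2, and any later update for "a" would go to server 0 again.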
[03:27] dbudworth
|
and hello stoneman, i'm here. but basically not useful if you are looking for help
|
[09:10] mato
|
sustrik: i was just thinking about the zmq_term()/zmq_close() semantics
|
[09:10] mato
|
sustrik: and i came up with one small detail
|
[09:11] mato
|
sustrik: which may or may not be helpful in the case where zmq_term() would wait indefinitely on exit
|
[09:11] mato
|
sustrik: the thing is, syscall close() means "commit".
|
[09:12] mato
|
sustrik: if you don't close your file descriptors, and don't use fsync() (not for sockets obviously), then there is no commitment on the part of the kernel
|
[09:13] mato
|
sustrik: hence, as far as sockets with data in-flight at zmq_term() time are concerned, we only need to care about those that zmq_close() has actually been called on
|
[09:13] mato
|
sustrik: for any other sockets, all bets are off
|
[09:19] sustrik
|
mato: sure, but user has to close all the sockets anyway
|
[09:19] sustrik
|
otherwise he'll end up with memory leaks
|
[09:19] mato
|
sustrik: i was thinking more of the case of "the application terminating in an abnormal way"
|
[09:20] sustrik
|
no guarantees then
|
[09:20] mato
|
sure, what i mean is...
|
[09:20] mato
|
if an application handles e.g. SIGINT/SIGTERM
|
[09:20] mato
|
should it go and call zmq_term() as part of its exit process in such a case?
|
[09:20] mato
|
or explicitly not do that, because the call may block...?
|
[09:21] mato
|
i guess the answer is application dependent
|
[09:21] sustrik
|
no idea
|
[09:21] mato
|
oh, and another thing
|
[09:21] mato
|
we talked about the non-zero-copy APIs
|
[09:22] mato
|
and i have the most obvious names :)
|
[09:22] mato
|
zmq_sendcopy and zmq_recvcopy ... ?
|
[09:22] mato
|
would do fine for 2.1, in 3.x the whole mess can be done the right way
|
[09:23] sustrik
|
dunno, does it make sense to introduce something that will be changed anyway?
|
[09:23] sustrik
|
it's just asking for problems
|
[09:23] mato
|
why?
|
[09:24] mato
|
well, one reason is that pieter uses all these "helper" functions in the user guide precisely for this reason
|
[09:25] sustrik
|
so, we'll have send, sendcopy and the helper function
|
[09:25] sustrik
|
:)
|
[09:25] mato
|
no, the helpers can go away
|
[09:25] mato
|
check with pieter, but imo the reason those are there is because zerocopy in C is "too much typing" or something :)
|
[09:25] mato
|
at least that's what he told me
|
[09:25] sustrik
|
anyway, first thing on this path is to define the semantics for recvcopy when the buffer is not large enough
|
[09:27] mato
|
yeah, good point
|
[09:27] mato
|
presumably some error will have to be returned ...
|
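One possible answer to the "buffer not large enough" question is to fail without copying anything. The sketch below illustrates that single option; `copy_message` is a hypothetical helper, not zmq_recvcopy and not any libzmq function, and the choice of EMSGSIZE as the error is an assumption. Truncating and reporting the full length would be another defensible design.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies a whole message into a caller-supplied buffer.
   Returns the message length on success. If the buffer is too small it
   copies nothing, sets errno to EMSGSIZE and returns -1, so the message
   is treated as an atomic unit. */
long copy_message(const void *msg, size_t msg_len, void *buf, size_t buf_len)
{
    if (msg_len > buf_len) {
        errno = EMSGSIZE;
        return -1;
    }
    if (msg_len > 0)
        memcpy(buf, msg, msg_len);
    return (long)msg_len;
}
```

A caller with a 4-byte buffer would get 3 back for a 3-byte message and -1 with EMSGSIZE for a 6-byte one.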
[09:27] mato
|
SCTP works in terms of messages as atomic units?
|
[09:30] sustrik
|
mato: yes
|
[13:36] user
|
was hoping someone can help me
|
[13:37] user
|
can't download anything, it always comes up with an error
|
[13:43] drbobbeaty
|
I was just able to download the POSIX version from the website: http://www.zeromq.org/area:download
|
[13:44] user
|
whats that?
|
[13:44] drbobbeaty
|
The POSIX version is the version for Linux, etc. There's also a Windows version on the same page.
|
[13:44] mato
|
sustrik: are you around?
|
[13:45] mato
|
sustrik: I've found a silly bug in the select() impl. of zmq_poll() on master
|
[13:45] sustrik
|
mato: here am I
|
[13:45] mato
|
sustrik: trivial fix, but something else is broken when using select()
|
[13:45] user
|
k. my error code tells me: Archive: /media/OFFICE12/setup.exe
|
[13:45] user
|
[/media/OFFICE12/setup.exe]
|
[13:45] user
|
End-of-central-directory signature not found. Either this file is not
|
[13:45] user
|
a zipfile, or it constitutes one disk of a multi-part archive. In the
|
[13:45] user
|
latter case the central directory and zipfile comment will be found on
|
[13:45] user
|
the last disk(s) of this archive.
|
[13:45] user
|
note: /media/OFFICE12/setup.exe may be a plain executable, not an archive
|
[13:45] user
|
zipinfo: cannot find zipfile directory in one of /media/OFFICE12/setup.exe or
|
[13:45] user
|
/media/OFFICE12/setup.exe.zip, and cannot find /media/OFFICE12/setup.exe.ZIP, period.
|
[13:46] mato
|
user: I'm sorry, you're probably on the wrong chat room here.
|
[13:46] user
|
where do i need to go?
|
[13:47] sustrik
|
:)
|
[13:47] sustrik
|
mato: wull?
|
[13:48] mato
|
sustrik: wull?
|
[13:48] sustrik
|
well?
|
[13:48] mato
|
hang on
|
[13:48] mato
|
i'm on bloody win32
|
[13:48] mato
|
everything is confusing :-)
|
[13:48] mato
|
patience...
|
[13:49] mato
|
sustrik: ok, 1st, somewhere around line 547 of zmq.cpp, the select () call needs to be changed to use maxfd + 1
|
[13:49] mato
|
sustrik: not just maxfd
|
[13:49] mato
|
sustrik: that's probably my fault
|
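The first argument of POSIX select() is the highest descriptor number in the sets plus one, which is the point of the fix mato describes. A minimal sketch of that convention follows; `count_readable` is a hypothetical helper written for illustration and is not the code in zmq.cpp.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Returns the number of readable descriptors, 0 on timeout, -1 on error.
   Assumes every descriptor is below FD_SETSIZE. */
int count_readable(const int *fds, size_t n, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = -1;
    size_t i;

    FD_ZERO(&readfds);
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is maxfd + 1: passing maxfd alone would make select()
       ignore the highest-numbered descriptor. */
    return select(maxfd + 1, &readfds, NULL, NULL, &tv);
}
```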
[13:49] sustrik
|
ok
|
[13:50] mato
|
sustrik: now, then, what i'm seeing is on win32 _or_ on Linux with ZMQ_FORCE_SELECT (and I patched zmq.cpp to also use select() for zmq_poll() when ZMQ_FORCE_SELECT is defined)
|
[13:50] mato
|
sustrik: for some reason a socket is not becoming ready on the app side when it should
|
[13:51] mato
|
sustrik: i.e. data gets sent down XREP, XREQ on the client side never becomes ready
|
[13:51] sustrik
|
do you have a simple test program?
|
[13:52] mato
|
working on it...
|
[13:52] sustrik
|
ok
|
[14:20] mato
|
sustrik: ok, i have a test case... will msg you, it's on the test box
|
[14:45] CIA-20
|
zeromq2: Martin Lucina master * r1abfc92 / src/zmq.cpp : minor problem in zmq_poll (select version) fixed - http://bit.ly/c9Sdnb
|
[14:55] CIA-20
|
zeromq2: Martin Lucina master * rf49b77e / src/zmq.cpp : zmq_poll honours ZMQ_FORCE_POLL and ZMQ_FORCE_SELECT options - http://bit.ly/dzi76e
|
[15:16] sustrik
|
mato: afaics the problem is that ZMQ_FD is edge-triggered
|
[15:17] sustrik
|
thus, IN/OUT flag may be set in the past, but the select/poll won't exit because of it
|
[15:18] sustrik
|
mato: check how poll version of zmq_poll works
|
[15:20] mato
|
sustrik: the code seems equivalent, no?
|
[15:20] sustrik
|
nope
|
[15:21] mato
|
then i don't understand the problem...
|
[15:21] sustrik
|
you have to delete lines 572-577
|
[15:21] sustrik
|
let me do it
|
[15:21] mato
|
which lines, i have different line numbers here...
|
[15:22] mato
|
and i'd like to understand what the problem is
|
[15:22] sustrik
|
you should _not_ check POLLIN is set on ZMQ_FD and check ZMQ_EVENTS anyway
|
[15:22] sustrik
|
whather
|
[15:22] sustrik
|
ehether
|
[15:23] sustrik
|
whether
|
[15:23] sustrik
|
:)
|
[15:23] mato
|
?
|
[15:23] sustrik
|
...whether POLLIN is set...
|
[15:24] mato
|
sustrik: hmm, so you're saying that each ZMQ_FD needs to be checked every time you come out of the select/poll() ?
|
[15:24] CIA-20
|
zeromq2: Martin Sustrik master * r4d51a52 / src/zmq.cpp : zmq_poll (select version) now correctly assumes that ZMQ_FD is edge-trigerred - http://bit.ly/dklauP
|
[15:25] sustrik
|
yes
|
[15:25] sustrik
|
committed
|
[15:25] sustrik
|
the test program seems to work
|
[15:26] mato
|
i still don't understand... select would not have exited if the fd did not become ready...?
|
[15:27] sustrik
|
what?
|
[15:27] mato
|
sustrik: if you're sitting in select(), and the notify fd becomes ready, then you read the events
|
[15:27] mato
|
sustrik: what is the other code path?
|
[15:28] sustrik
|
the commands in ZMQ_FD were already processed before calling zmq_poll
|
[15:28] sustrik
|
select blocks forever
|
[15:28] sustrik
|
although there are messages available
|
[15:28] mato
|
processed by who?
|
[15:28] sustrik
|
random previous command
|
[15:29] mato
|
ah, right, this is because ZMQ_FD is tapping straight into the signaller
|
[15:29] mato
|
mumble
|
[15:29] sustrik
|
ack
|
[15:29] mato
|
i wish we could fix that
|
[15:29] sustrik
|
?
|
[15:29] sustrik
|
it's fixed
|
[15:29] mato
|
this is also why you do that first_pass thing, right?
|
[15:29] sustrik
|
yes
|
[15:29] sustrik
|
no timeout on first pass
|
[15:30] sustrik
|
exit immediately
|
[15:30] sustrik
|
then check whether events are available
|
[15:30] mato
|
this is to pick up events coming from previously processed commands, right?
|
[15:30] sustrik
|
yes
|
[15:30] mato
|
ok, understood
|
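The failure mode discussed above can be reproduced without libzmq. The sketch below is a simulation under stated assumptions, not the zmq.cpp code: a pipe stands in for the signaler behind ZMQ_FD, the `pending` counter stands in for what ZMQ_EVENTS reports, and all names (`mailbox`, `ready_level_assumption`, `ready_edge_aware`) are hypothetical. It shows why a check that trusts the descriptor alone misses events once an earlier call has already drained the signal.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

struct mailbox {
    int notify_rd;   /* read end of the pipe, stands in for ZMQ_FD */
    int notify_wr;   /* write end, one byte per incoming command */
    int pending;     /* messages ready for the app, stands in for ZMQ_EVENTS */
};

int mailbox_init(struct mailbox *mb)
{
    int p[2], flags;

    if (pipe(p) != 0)
        return -1;
    flags = fcntl(p[0], F_GETFL, 0);
    if (flags == -1 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    mb->notify_rd = p[0];
    mb->notify_wr = p[1];
    mb->pending = 0;
    return 0;
}

/* A command arrives: the descriptor becomes readable. */
int mailbox_post(struct mailbox *mb)
{
    return write(mb->notify_wr, "c", 1) == 1 ? 0 : -1;
}

/* What any earlier call may do internally: drain the signal and turn
   commands into pending messages. Afterwards the descriptor is quiet. */
void mailbox_process_commands(struct mailbox *mb)
{
    char c;

    while (read(mb->notify_rd, &c, 1) == 1)
        mb->pending++;
}

int mailbox_events(struct mailbox *mb)
{
    mailbox_process_commands(mb);
    return mb->pending > 0;
}

static int fd_readable_now(int fd)
{
    fd_set rd;
    struct timeval tv;

    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    tv.tv_sec = 0;
    tv.tv_usec = 0;
    return select(fd + 1, &rd, NULL, NULL, &tv) > 0;
}

/* Broken: only looks at events when the descriptor reports readable. */
int ready_level_assumption(struct mailbox *mb)
{
    if (!fd_readable_now(mb->notify_rd))
        return 0;
    return mailbox_events(mb);
}

/* Edge-aware: checks events regardless of the descriptor state. */
int ready_edge_aware(struct mailbox *mb)
{
    return mailbox_events(mb);
}
```

If a command is posted and then drained by `mailbox_process_commands` before polling, `ready_level_assumption` reports nothing while `ready_edge_aware` still sees the pending message. With a blocking select() instead of the zero timeout used here, the first variant would wait forever, which matches the "select blocks forever" symptom.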
[15:38] sustrik
|
mato: btw, if you want to optimise it you can still perform the check when !first_pass
|
[15:39] sustrik
|
getting ZMQ_EVENTS can be rather slow as it involves reading from the signaler => recv()
|
[15:40] mato
|
right, that might be a good idea
|
[15:40] mato
|
also, is getting ZMQ_FD slow?
|
[15:41] mato
|
since that is also done more times than strictly necessary
|
[15:41] mato
|
sustrik: btw, the signaler uses socketpair?
|
[15:41] mato
|
sustrik: which translates to a tcp connection with winsock?
|
[15:44] sustrik
|
yes
|
[15:44] mato
|
interesting, i get an occasional "Address already in use" on Win32 from signaler.cpp:80
|
[15:45] sustrik
|
let me see
|
[15:45] mato
|
I guess this is just the poor M$ WIn$ock running out of tcp ports or something
|
[15:45] mato
|
it's the "connect to remote peer" call...
|
[15:45] sustrik
|
EADDRINUSE on connect?
|
[15:45] mato
|
yeah
|
[15:45] mato
|
bizarre
|
[15:46] sustrik
|
hm, it's documented
|
[15:46] mato
|
"not enough ports" ?
|
[15:47] sustrik
|
very useful description:
|
[15:47] sustrik
|
[EADDRINUSE]
|
[15:47] sustrik
|
Attempt to establish a connection that uses addresses that are already in use.
|
[15:47] sustrik
|
(that's POSIX)
|
[15:48] mato
|
sustrik: windoze is slightly different
|
[15:48] sustrik
|
Linux: EADDRINUSE
|
[15:48] sustrik
|
Local address is already in use.
|
[15:48] mato
|
http://msdn.microsoft.com/en-us/library/ms740668%28VS.85%29.aspx
|
[15:49] sustrik
|
The socket's local address is already in use and the socket was not marked to allow address reuse with SO_REUSEADDR. This error usually occurs when executing bind, but could be delayed until the connect function if the bind was to a wildcard address (INADDR_ANY or in6addr_any) for the local IP address. A specific address needs to be implicitly bound by the connect function.
|
[15:49] mato
|
Yeah, I never quite understood the SO_REUSEADDR semantics on win32
|
[15:49] sustrik
|
does it make any sense to you?
|
[15:49] mato
|
but i'll try adding that in and see if it changes anything
|
[15:50] mato
|
well, i think what it's saying is "the wildcard bind() picked a port that is still in time-wait state"
|
[15:50] mato
|
or some nonsense like that
|
[15:50] sustrik
|
:|
|
[15:50] mato
|
let me try adding SO_REUSEADDR for win32 to the signaler listen socket and see what happens
|
[15:50] mato
|
i don't care what it actually does on win32 as long as the error goes away :-)
|
[15:51] mato
|
proper win32 solution is obviously to use named pipes/win32 objects/whatsits and iocp
|
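For reference, setting SO_REUSEADDR on a listening socket before bind() looks roughly like this with POSIX sockets. This is a sketch, not the signaler.cpp code: `make_loopback_listener` is a hypothetical helper, the loopback address and ephemeral port are assumptions, and on Winsock both the setsockopt() argument types and the meaning of SO_REUSEADDR differ.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP listener on 127.0.0.1 with an OS-assigned port.
   Returns the socket and stores the port in *port_out, or returns -1. */
int make_loopback_listener(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    /* Must be set before bind() to have any effect on the bind. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) == -1)
        goto fail;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;   /* let the OS pick a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto fail;
    if (listen(fd, 1) == -1)
        goto fail;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
        goto fail;

    *port_out = ntohs(addr.sin_port);
    return fd;

fail:
    close(fd);
    return -1;
}
```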
[15:54] sustrik
|
ok
|
[16:07] mato
|
hmm
|
[16:07] mato
|
i dunno
|
[16:07] mato
|
doesn't seem to help much
|
[16:07] mato
|
anyway, this doesn't really matter
|
[16:08] mato
|
i added SO_REUSEADDR to both ends of the emulated socketpair and still get EADDRINUSE back
|
[16:08] mato
|
so i think it's just poor windows running out of ports to auto-assign or something :-)
|
[16:08] mato
|
sustrik: anyway, the problem with select() has been fixed, so it's all good!
|
[16:11] sustrik
|
nice
|
[16:11] sustrik
|
you may fill in the bug report for the EADDRINUSE stuff
|
[16:24] mato
|
done
|
[20:00] cpscotti
|
Hello there, anyone up to a philosophical (although basic) discussion on a zmq networking topology for a "generic" application? Regarding many clients & services but without a broker.
|
[20:17] cremes
|
cpscotti: i'd be happy to have that conversation with you tomorrow; i have to leave for the rest of today
|
[20:19] cpscotti
|
cremes: thanks.. if nothing is solved in my side of things I'll try tomorrow then
|
[20:19] cremes
|
ok
|
[21:07] ModusPwnens
|
is the send function faster than the receive function? or are they both just as fast?
|
[21:08] cpscotti
|
send doesn't block, recv blocks
|
[21:08] cpscotti
|
(as for the "program flow" speed)
|
[21:08] cpscotti
|
now for the underlying stuff, dunno
|
[21:08] ModusPwnens
|
even with pub/sub topology?
|
[21:08] jhawk28
|
recv can do a noblock flag
|
[21:08] cpscotti
|
awl.. yep..
|
[21:08] ModusPwnens
|
I see.
|
[21:09] ModusPwnens
|
So that would explain benchmarking tests where
|
[21:09] ModusPwnens
|
i run several in a row
|
[21:09] ModusPwnens
|
http://pastie.org/1161267
|
[21:09] ModusPwnens
|
that's probably better than explaining it
|
[21:10] ModusPwnens
|
I've sort of run into a wall with this and have been stuck on it for several days
|
[21:10] ModusPwnens
|
because I am seeing those results
|
[21:10] ModusPwnens
|
and am not sure what to make of them
|
[22:41] larrytheliquid
|
are there any semantic implications to connecting multiple times to the same address?
|
[22:41] larrytheliquid
|
from the same socket, that is
|