[Time] Name | Message |
[05:44] liang2012
|
has anyone encountered a send being blocked?
|
[09:28] CIA-79
|
libzmq: 03Mikko Koppanen 07master * r6c1b50c 10/ (acinclude.m4 configure.in src/ip.cpp): Added compile-time test for SOCK_CLOEXEC ...
|
[09:51] mikko
|
hi
|
[09:51] mikko
|
how was ukraine?
|
[09:52] sustrik
|
mikko: hi
|
[09:52] sustrik
|
great
|
[09:52] sustrik
|
0mq meet turned into pycon after-after-party as people leaving the after-party joined the meetup
|
[09:53] sustrik
|
some have allegedly even continued to an after-after-after-party :)
|
[09:54] sustrik
|
btw, i've asked pieter to pull the CLOEXEC patch to 3.0 and 2.1
|
[09:55] mikko
|
cool. thanks
|
[10:04] mikko
|
sustrik: seen LIBZMQ-275?
|
[10:04] sustrik
|
yes
|
[10:04] sustrik
|
i kind of recall fixing that kind of thing already
|
[10:23] jond
|
hi Martin re: that zynga issue which Henry Geddes posted a stack trace on; another poster Marc Rossi remembered a patch he had applied which does appear to be in 2.1.10 but the code is slightly different to original patch as it handles a timeout now; this is in socket_base_t::recv. does this ring any bells?
|
[10:40] mikko
|
sustrik: might be 3.0 issue?
|
[10:41] sustrik
|
mikko: may be, i'll check it
|
[10:42] sustrik
|
jond: not really, i'm going to check
|
[10:42] sustrik
|
(been out of town for 6 days, i'm still catching up)
|
[10:58] jond
|
sustrik: no problem, well he ends with two threads in epoll_wait (that'll be reaper, and io) and another doing poll from mailbox. mikko has asked them if they are running centos
|
[10:58] jond
|
which they are
|
[12:07] mikko
|
jond: i saw this on irc the other day
|
[12:07] mikko
|
guy was running pub/sub with large throughput
|
[12:07] mikko
|
and was having hangups
|
[12:07] mikko
|
on centos5
|
[12:58] sustrik
|
mikko: LIBZMQ-261, the kqueue problem
|
[12:58] sustrik
|
i'm running the attached test on freebsd
|
[12:58] sustrik
|
but it seems to work ok
|
[12:59] sustrik
|
do you remember whether there was some other freebsd test program?
|
[13:15] mikko
|
sustrik: which version are you testing?
|
[13:16] sustrik
|
2-1
|
[13:17] sustrik
|
mikko: latest 2-1 i mean
|
[13:18] mikko
|
sustrik: did you remove the patch?
|
[13:18] mikko
|
the one that i wrote hides the issue
|
[13:19] sustrik
|
it's been applied to 2-1?
|
[13:19] mikko
|
yes, pieter applied it
|
[13:19] sustrik
|
i see
|
[13:19] sustrik
|
let me downgrade to 2.1.10
|
[13:19] mikko
|
https://zeromq.jira.com/browse/LIBZMQ-261?focusedCommentId=13522&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13522
|
[13:20] mikko
|
2.1.9 iirc doesnt have it
|
[13:20] sustrik
|
the error was reported with 2.1.10
|
[13:20] sustrik
|
so it should be visible there
|
[13:20] mikko
|
i reported it against 2.1.0
|
[13:20] mikko
|
ermm
|
[13:21] mikko
|
2.1.10 because that was the trunk back then
|
[13:21] mikko
|
i think 2.1.10 released with this patch
|
[13:21] sustrik
|
i see
|
[13:22] sustrik
|
let's use 2.1.9 then
|
[13:22] mikko
|
you should get it out with ease on 2.1.9
|
[13:22] mikko
|
my patch hides the error and makes it workable
|
[13:22] mikko
|
probably not ideal but seems to work
|
[13:23] sustrik
|
trying...
|
[13:25] jond
|
sustrik: could kevent::rm_fd call kevent_twice?
|
[13:25] sustrik
|
no idea
|
[13:25] sustrik
|
:)
|
[13:26] sustrik
|
i haven't written the code
|
[13:26] jond
|
s/kevent/kqueue/g
|
[13:27] sustrik
|
i would say it should call it only once
|
[13:27] jond
|
well I wonder what happens if pe->flag_pollin and pe->flag_pollout are both true
|
[13:27] jond
|
as it's called twice then
|
[13:28] sustrik
|
mikko: no luck
|
[13:28] sustrik
|
jond: let me see
|
[13:29] jond
|
looks iffy to me, but i don't have a bsd system
|
[13:29] sustrik
|
jond: what line?
|
[13:30] jond
|
kqueue.cpp, zmq::kqueue_t::rm_fd
|
[13:32] sustrik
|
hm, i have no experience with kqueue
|
[13:32] sustrik
|
however, it seems that READ and WRITE events are different objects
|
[13:32] sustrik
|
and so both should be removed
|
[13:33] jond
|
sustrik: same here; can the removal be done in single call, or'ing the flags?
|
[13:34] sustrik
|
no idea
|
[13:34] sustrik
|
however, it should have no impact on this bug
|
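On jond's question about removing both filters in one call: as far as I know, kqueue filters are separate identifiers rather than bit flags, so they cannot be OR'ed together, but a single kevent() call accepts a change list with several entries. A minimal Python sketch of that idea follows. It is not the libzmq code; `removal_changes` and `apply_removals` are hypothetical helpers, and the fallback numeric values are assumptions taken from <sys/event.h> on FreeBSD and Mac OS X.

```python
import select

# Use the platform's constants when Python exposes them (BSD, Mac OS X);
# the fallbacks are assumed values so the sketch also runs elsewhere.
EVFILT_READ = getattr(select, "KQ_FILTER_READ", -1)
EVFILT_WRITE = getattr(select, "KQ_FILTER_WRITE", -2)
EV_DELETE = getattr(select, "KQ_EV_DELETE", 0x0002)


def removal_changes(fd, pollin, pollout):
    """Build one delete entry per filter registered for fd (hypothetical helper)."""
    changes = []
    if pollin:
        changes.append((fd, EVFILT_READ, EV_DELETE))
    if pollout:
        changes.append((fd, EVFILT_WRITE, EV_DELETE))
    return changes


def apply_removals(kq, changes):
    """Submit all entries in a single kevent() call (BSD / Mac OS X only)."""
    events = [select.kevent(fd, filter=flt, flags=flags)
              for fd, flt, flags in changes]
    kq.control(events, 0)  # 0 = do not wait for any events back


# Both flags set: two entries, one per filter, for the same descriptor.
changes = removal_changes(7, pollin=True, pollout=True)
print(changes)
```

`apply_removals` is only defined here, not called, because `select.kqueue` does not exist on Linux. Whether batching the two deletes changes anything about the reported bug is a separate question.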
[13:35] sustrik
|
damn, i cannot reproduce it
|
[13:35] sustrik
|
mikko: have you used a different test case or something?
|
[13:58] mikko
|
sustrik: yes
|
[13:58] mikko
|
it crashed pzq all the time
|
[13:59] sustrik
|
on freebsd, right?
|
[13:59] mikko
|
on mac
|
[13:59] mikko
|
Environment:
|
[13:59] mikko
|
Mac OS X
|
[13:59] sustrik
|
:(
|
[13:59] sustrik
|
no mac os x here
|
[14:00] mikko
|
should be similar error
|
[14:00] mikko
|
Gábor Farkas seems to have issue with push/pull on freebsd
|
[14:00] mikko
|
let me test if the test code in the issue does it on mac
|
[14:01] sustrik
|
it's attached by someone with osx so presumably it should
|
[14:01] sustrik
|
what about trying pzq on freebsd?
|
[14:03] mikko
|
the test case looks like a simplified version of what pzq does
|
[14:03] mikko
|
i wonder if this is os x specific?
|
[14:03] sustrik
|
maybe
|
[14:03] mikko
|
or at least harder to reproduce on freebsd
|
[14:03] sustrik
|
still, gabor reports something similar
|
[14:03] sustrik
|
yes, harder to reproduce
|
[14:03] sustrik
|
and the error code is actually different (so your patch doesn't solve it)
|
[14:04] sustrik
|
gabor reports EBADF
|
[14:04] sustrik
|
while mac os x produces ENOENT
|
[14:25] djc
|
I have a situation with two inproc pub/subs, where the second and later subscriber threads connecting to the second inproc channel seem to fail
|
[14:28] sustrik
|
what error?
|
[14:28] djc
|
connection refused
|
[14:29] djc
|
it appears that maybe the publisher thread fails
|
[14:29] djc
|
because a quick second subscriber does work
|
[14:29] djc
|
but they reliably start failing to connect after a while
|
[14:30] djc
|
(these are python threads, btw)
|
[14:31] sustrik
|
zmq_connect() returns connection refused?
|
[14:31] mikko
|
djc: does the bind happen before connect in all cases?
|
[14:31] sustrik
|
it's not a legal return code
|
[14:31] djc
|
mikko: yeah, the bind happens way before connect
|
[14:31] djc
|
sustrik: it mentions zmq/core/socket.c:4114
|
[14:32] djc
|
(this is with 2.1.7, but I didn't see anything particularly relevant in the NEWS)
|
[14:32] sustrik
|
there's no such file
|
[14:32] sustrik
|
not even the directory
|
[14:32] djc
|
it's probably pyzmq
|
[14:32] djc
|
http://paste.pocoo.org/show/498476/
|
[14:34] sustrik
|
hm, maybe there's a bug in 2.1.7 causing zmq_connect() to return ECONNREFUSED instead of trying to reconnect
|
[14:35] sustrik
|
but i quite doubt it
|
[14:35] djc
|
it throws this ZMQError on any non-zero return from zmq_connect()
|
[14:35] sustrik
|
as "connect first, bind second" used to work even back then
|
[14:35] djc
|
according to the pyzmq code
|
[14:36] sustrik
|
irrespective of what the actual error is?
|
[14:36] sustrik
|
you should file a ticket in the pyzmq bug tracker then
|
[14:37] djc
|
yeah
|
[14:37] djc
|
I'm checking if it's somehow wrapping the error code in the exception
|
[14:37] sustrik
|
the valid error codes are described here:
|
[14:37] sustrik
|
http://api.zeromq.org/2-1:zmq-connect#toc4
|
[14:38] djc
|
most of those seem rather unlikely if an earlier invocation of the same code works
|
[14:39] sustrik
|
find out what the actual error code is then
|
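One way to find out what an error code actually is would be to map the numeric value back to its symbolic name. A minimal sketch using only the Python standard library; `describe_errno` is a hypothetical helper, and the codes in the loop are simply the ones mentioned in this log. With pyzmq, the number would presumably come from the `errno` attribute of the raised ZMQError. 0MQ-specific codes such as ETERM are not in the standard table and would show up as UNKNOWN here.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: message' for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%s: %s" % (name, os.strerror(code))


for code in (errno.ECONNREFUSED, errno.EBADF, errno.ENOENT):
    print(code, describe_errno(code))
```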
[14:41] mikko
|
sustrik: those dont seem to be right
|
[14:41] mikko
|
sustrik: if you connect inproc to a non-existent endpoint you get connection refused
|
[14:42] mikko
|
at least that seems to be the case with 2.1
|
[14:42] sustrik
|
yes, the auto-reconnect functionality for inproc is missing
|
[14:42] sustrik
|
djc: is it inproc transport?
|
[14:43] djc
|
sustrik: yeah
|
[14:43] mikko
|
sustrik: i think ECONNREFUSED should be added to that man page
|
[14:43] sustrik
|
mikko: yes
|
[14:43] djc
|
that's what I mentioned at the start :)
|
[14:43] sustrik
|
djc: there's unimplemented feature with inproc
|
[14:43] sustrik
|
it doesn't reconnect automatically upon failure
|
[14:44] djc
|
failure for what reason?
|
[14:44] sustrik
|
for example that there's nobody bound to the endpoint
|
[14:44] sustrik
|
so it returns ECONNREFUSED instead
|
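Since inproc in 2.1 does not reconnect on its own, the caller would have to retry the connect itself. A rough sketch of such a retry loop, assuming the connect call raises an exception carrying an `errno`; `connect_with_retry` and `fake_connect` are hypothetical names, and the stand-in raises OSError, whereas with pyzmq you would catch its ZMQError instead.

```python
import errno
import time


def connect_with_retry(connect, endpoint, attempts=5, delay=0.1):
    """Call connect(endpoint), retrying only while it fails with ECONNREFUSED."""
    for attempt in range(1, attempts + 1):
        try:
            connect(endpoint)
            return attempt  # number of tries that were needed
        except OSError as exc:
            if exc.errno != errno.ECONNREFUSED or attempt == attempts:
                raise
            time.sleep(delay)  # give the binding side time to call bind()


# Stand-in for a socket's connect method: refuses twice, then succeeds.
calls = []


def fake_connect(endpoint):
    calls.append(endpoint)
    if len(calls) < 3:
        raise OSError(errno.ECONNREFUSED, "Connection refused")


tries = connect_with_retry(fake_connect, "inproc://events", delay=0.01)
print(tries)  # 3
```

Any error other than ECONNREFUSED is re-raised immediately, so a genuinely invalid endpoint is not retried.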
[14:44] djc
|
ah! so if all the subscribers end, an inproc publisher would also end?
|
[14:45] djc
|
if so, you should mention that on http://api.zeromq.org/2-1:zmq-inproc
|
[14:45] mikko
|
djc: it shouldnt
|
[14:46] mikko
|
assuming it doesnt exit
|
[14:46] djc
|
well, that kind of describes the behavior I'm seeing
|
[14:47] mikko
|
djc: if your bound socket exits and then rebinds, that would probably cause errors on clients
|
[14:48] djc
|
okay, it looks like my publisher is somehow ending
|
[15:17] CIA-79
|
jzmq: 03Gonzalo Diethelm 07master * rdaf4775 10/ pom.xml : Changed version to 1.0.0 in preparation to first oficial release. - http://git.io/bB309Q
|
[15:54] CIA-79
|
pyzmq: 03MinRK 07master * r6c5b78c 10/ zmq/utils/jsonapi.py : fix dumps->loads typo in jsonapi.__all__ ...
|
[16:09] mikko
|
sustrik: back
|
[16:10] mikko
|
sustrik: do you want me to look into this kqueue thing?
|
[16:10] mikko
|
is there a backtrace you could use?
|
[16:11] sustrik
|
mikko: hm
|
[16:11] sustrik
|
no idea what's going on there
|
[16:11] sustrik
|
we can start with the backtrace
|
[16:11] mikko
|
i'll try to ebay a cheap mac mini
|
[16:11] mikko
|
i can probably stick it into same place where build cluster is
|
[16:12] sustrik
|
that would be nice
|
[16:12] mikko
|
it would probably be beneficial as we have large amount of users on macs
|
[16:12] sustrik
|
yep, it's kind of annoying not to be able to reproduce the osx problems
|
[16:14] jond
|
mikko, sustrik: perhaps dtrace would be useful in looking at the problem....
|
[16:18] cremes
|
mikko, sustrik: i can provide an account to you guys on my osx box at work
|
[16:18] cremes
|
actually, i already did that for sustrik a while ago...
|
[16:18] cremes
|
account should still be active
|
[16:19] sustrik
|
cremes: great
|
[16:22] mikko
|
cremes: that would be great if you can provide sustrik with access
|
[16:22] mikko
|
can't see cheap imacs on ebay atm
|
[16:22] cremes
|
mikko: we're chatting privately about it :)
|
[16:24] mikko
|
cremes: is it a development server?
|
[16:24] mikko
|
i wonder if you might be able to run a build slave on it
|
[16:25] cremes
|
mikko: it's my desktop that i use for all development... it's pretty busy for 12 hours a day
|
[16:25] cremes
|
but should be available as a potential build slave for the remainder
|
[16:26] mikko
|
ah ok
|
[16:27] mikko
|
hmm
|
[16:27] mikko
|
it might be better if i ebay a mac mini
|
[16:27] cremes
|
probably :)
|
[16:27] cremes
|
but i'm willing to lend a hand in the short term if you need it
|
[16:29] mikko
|
short term it's more important for sustrik to have access
|
[16:29] mikko
|
so that he can fix all the weird ones
|
[16:39] cremes
|
mikko: he's got it
|
[16:44] sustrik
|
cremes, mikko: reproduced!
|
[16:44] sustrik
|
thanks!
|
[16:45] cremes
|
great news!
|
[16:51] mikko
|
sustrik: cool
|
[16:51] mikko
|
sustrik: it was easier on mac os x ?
|
[16:51] mikko
|
it might be that it's also easier to reproduce on older freebsd
|
[16:56] sustrik
|
it happened immediately
|
[17:34] crankycoder
|
has there been any progress on making 0mq safer from fuzzing? I saw a post about hardening it via a competition, but can't find any links to that competition.
|
[17:34] crankycoder
|
http://www.mail-archive.com/zeromq-dev@lists.zeromq.org/msg05592.html
|
[17:41] mikko
|
crankycoder: it's a work in progress
|
[17:42] crankycoder
|
is there anywhere to track it?
|
[17:42] crankycoder
|
i'm looking for a bug tracker or list of bugs outstanding
|
[17:59] Steve-o
|
enjoy packages for zeromq-java on windows. well 64-bit anyway.
|
[18:24] mikko
|
maybe we should add issue tracker url to topic
|
[18:44] collision
|
Hi, i'm using a REQ-REP socket
|
[18:45] collision
|
the client send arrives ok
|
[18:45] collision
|
but the server responses arrive corrupted at the client
|
[18:50] collision
|
i found it
|
[18:51] collision
|
never mind
|
[21:59] tarcieri
|
guess I should chill here
|
[21:59] tarcieri
|
:)
|
[21:59] mikko
|
possibly
|
[21:59] cremes
|
tarcieri: i saw you ping me in rubinius... what's up?
|
[21:59] tarcieri
|
cremes: oh hey, just playing with ffi-rzmq
|
[22:00] tarcieri
|
had some more general questions so I figured I'd pop in here
|
[22:00] cremes
|
no! don't add 0mq to celluloid!
|
[22:00] cremes
|
:)
|
[22:00] tarcieri
|
lol
|
[22:00] cremes
|
i'll try to help... i have limited time as i'm going out tonight...
|
[22:00] cremes
|
what's your ??
|
[22:00] tarcieri
|
just trying to figure out how to map distributed Celluloid onto 0MQ
|
[22:01] tarcieri
|
I want to use one actor per link to another node
|
[22:01] tarcieri
|
and have that actor talk to a corresponding actor on the remote node using an exclusive pair
|
[22:01] cremes
|
ok
|
[22:02] tarcieri
|
it's just as I read about exclusive pairs the docs keep guilt tripping me into trying something else
|
[22:02] cremes
|
sure... PAIR offers no real benefit over regular bsd sockets
|
[22:03] tarcieri
|
seems like 0MQ would have me use inproc for ALL inter-actor communication
|
[22:03] tarcieri
|
and like what, marshal every time?
|
[22:03] tarcieri
|
seems bad
|
[22:03] cremes
|
yes... inproc does a little "pointer flip" behind the scenes but you can't use that with ruby
|
[22:04] cremes
|
since objects move in memory due to GC
|
[22:04] tarcieri
|
yeah
|
[22:04] cremes
|
so you have to marshal
|
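What "marshal every time" means in practice: only bytes cross the socket, so each object is serialized on send and rebuilt on receive. A small Python sketch of that round trip, with `pickle` playing the role of Ruby's Marshal and a `queue.Queue` standing in for the inproc socket pair; `send_obj` and `recv_obj` are hypothetical helpers, not part of any 0MQ binding.

```python
import pickle
import queue

# The queue stands in for an inproc socket pair: only bytes cross it.
wire = queue.Queue()


def send_obj(obj):
    """Serialize obj and put the resulting bytes on the wire."""
    wire.put(pickle.dumps(obj))


def recv_obj():
    """Take bytes off the wire and rebuild an equal, but separate, object."""
    return pickle.loads(wire.get())


original = {"actor": "logger", "args": [1, 2, 3]}
send_obj(original)
copy = recv_obj()
print(copy == original, copy is original)  # True False
```

The receiver gets an equal copy rather than the same object, which is the cost being weighed here against in-VM messaging that can pass references directly.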
[22:04] tarcieri
|
especially in, say, JRuby
|
[22:04] tarcieri
|
which is kinda the #1 thing I'm targeting right now
|
[22:04] cremes
|
it's a pain in the ass... we need a Object#pin/Object#unpin to allow for locking objects in place
|
[22:04] tarcieri
|
so anyway Celluloid already has primitives for in-VM messaging
|
[22:04] tarcieri
|
and I'm not going to rip them out and replace them with 0MQ
|
[22:05] tarcieri
|
although I could offer 0MQ-based actors if I wanted, i just don't see the point really :/
|
[22:05] cremes
|
well, 0mq-based actors when they are spread over multiple machines
|
[22:05] tarcieri
|
yeah, indeed
|
[22:05] cremes
|
on a single box it doesn't offer you much benefit
|
[22:05] tarcieri
|
so like... you have N nodes that all need to talk to each other
|
[22:06] grantr
|
tarcieri, sounds like a job for xrep/xreq (or dealer/router, depending on who you're talking to)
|
[22:06] tarcieri
|
at least for the basic actor protocol I don't really see the usefulness of anything but pairs
|
[22:06] cremes
|
right
|
[22:06] tarcieri
|
grantr: Celluloid messaging is fully asynchronous though
|
[22:07] grantr
|
yep, so is xre[pq]
|
[22:07] cremes
|
grantr: when using ZMQ_NOBLOCK
|
[22:07] grantr
|
they dont enforce the request-reply ordering
|
[22:07] grantr
|
cremes, right
|
[22:08] tarcieri
|
grantr: so if I understand them correctly... I think they're a bad fit
|
[22:08] tarcieri
|
grantr: there's not necessarily replies to any message
|
[22:08] cremes
|
tarcieri: have you read the guide? zero.mq/zg
|
[22:09] cremes
|
it's the best doc to use for getting a foundational understanding of 0mq
|
[22:09] grantr
|
tarcieri, try this for an overview of the socket types http://api.zeromq.org/3-0:zmq-socket
|
[22:09] cremes
|
and it is possible that 0mq isn't a good fit
|
[22:12] grantr
|
cremes, why shouldn't tarcieri add 0mq to celluloid?
|
[22:13] tarcieri
|
so the thing is
|
[22:13] cremes
|
grantr: why use it if it offers no benefit over bsd sockets?
|
[22:13] tarcieri
|
I can see various other socket types being useful for specific cases
|
[22:13] cremes
|
it would just be a wasted dependency
|
[22:13] tarcieri
|
those wouldn't be part of the general actor communication protocol though
|
[22:14] tarcieri
|
but the thing is, if I want to incorporate those types later, shouldn't I start with 0mq, or is that a waste of time?
|
[22:14] grantr
|
cremes, i mean this: <cremes> no! don't add 0mq to celluloid!
|
[22:14] tarcieri
|
particularly pubsub and push/pull
|
[22:14] tarcieri
|
I assume that was facetious
|
[22:14] grantr
|
i thought so too :)
|
[22:15] cremes
|
grantr: oh, that first response was a joke :)
|
[22:15] grantr
|
as i see it, the advantage of using zmq for inter-node communication - abstracts socket setup/teardown, abstracts failure handling, abstracts buffering
|
[22:16] grantr
|
dunno if its worth the dependency
|
[22:16] cremes
|
tarcieri: a lot of the internal 0mq classes reflect the message-passing aesthetic of actors... mailboxes and whatnot
|
[22:16] cremes
|
fyi
|
[22:17] cremes
|
grantr is right... it does help with those issues but i suggest you remember YAGNI
|
[22:17] cremes
|
you might only ever use PAIR in which case i think the dependency is pointless
|
[22:18] grantr
|
for my use case with celluloid, the other socket types are a big help so it makes sense for ME to use zmq in other places too
|
[22:18] grantr
|
i could see if the only thing you need it for is casting rpc to other nodes, maybe not worth the dependency (since it can be a pretty gnarly dependency to resolve)
|
[22:19] cremes
|
sorry folks... gotta run; i'll check the scrollback in a few hours
|
[22:20] tarcieri
|
I think pubsub is the biggest use case
|
[22:21] grantr
|
thats the one i need
|
[22:21] tarcieri
|
the biggest non-pair use case, that is
|
[22:23] tarcieri
|
cremes: well provided you check the scrollback, I did want to confirm that ZMQ::Contexts are thread safe
|
[22:23] grantr
|
tarcieri, contexts are, sockets are not
|
[22:23] tarcieri
|
I'd like to share one across all threads if possible
|
[22:23] tarcieri
|
yeah
|
[22:23] tarcieri
|
cool
|
[22:23] grantr
|
contexts are intended to be one per process (as i understand it)
|
[22:23] tarcieri
|
seems good
|
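The "contexts are thread safe, sockets are not" rule suggests sharing one context and giving every thread its own socket. A sketch of that pattern with the standard library only; `PerThreadSockets` is a hypothetical helper, and the factory here returns plain objects as stand-ins, where with pyzmq it might be something like `lambda: ctx.socket(zmq.PUB)`.

```python
import threading


class PerThreadSockets:
    """Hand each thread its own socket, created lazily from a shared factory."""

    def __init__(self, make_socket):
        self._make_socket = make_socket  # called once per thread
        self._local = threading.local()

    def get(self):
        sock = getattr(self._local, "sock", None)
        if sock is None:
            sock = self._local.sock = self._make_socket()
        return sock


# Stand-in factory: a plain object() plays the role of a socket here.
sockets = PerThreadSockets(object)
seen = []
lock = threading.Lock()


def worker():
    first, second = sockets.get(), sockets.get()
    with lock:
        seen.append((first, second))


threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Within a thread, repeated `get()` calls return the same socket; across threads, each one sees a different socket, so no socket is ever touched by two threads.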