[Time] Name | Message |
[00:06] leiger
|
Hey, is Pieter here?
|
[00:22] cremes
|
sustrik: got a repro for you here: https://github.com/zeromq/zeromq2/issues/171
|
[00:22] cremes
|
it affects tcp and ipc transports
|
[00:23] cremes
|
sustrik: if you'd like, i can set this up under your account on my box so you can see it since i assume you don't plan
|
[00:23] cremes
|
to install ruby and the dependent libraries
|
[00:23] cremes
|
let me know
|
[00:34] lt_schmidt_jr
|
Hi guys, is anyone compiling zmq for multiple platforms?
|
[00:35] lt_schmidt_jr
|
anyone do jar packaging for binaries for java?
|
[00:56] kdj
|
I am having a problem uninstalling an older version of pyzmq...
|
[00:56] kdj
|
I mean a newer version, actually
|
[00:56] kdj
|
I have pyzmq 2.1.0 installed, I want to revert to pyzmq 2.0.10
|
[00:57] kdj
|
I remove the directory and .egg from site-packages, and the module seems removed...
|
[00:57] kdj
|
But when I re-install it, the zmq_version() is still 2.1.0, *except* for in the shell I installed from O_o
|
[00:57] kdj
|
This is in Windows
|
[00:59] kdj
|
Oh, it is because I am in the directory in which zmq is located. I still don't get how they're not the same...
|
[01:37] erickt
|
hello #zeromq. I noticed that the HEAD of zeromq2 doesn't seem to be building and installing the documentation.
|
[01:42] jugg
|
erickt, check the output of ./configure, you're probably missing a dependency.
|
[01:44] Steve-o
|
which should be asciidoc
|
[01:45] erickt
|
I got both asciidoc and xmlto installed. "make dist" properly builds all the docs
|
[01:47] Steve-o
|
didn't someone catch this on the list recently
|
[01:48] erickt
|
ah yeah the documentation does build correctly with v2.1.0
|
[01:49] Steve-o
|
or maybe a new side effect with the new api.zero.mq support
|
[02:00] erickt
|
oh, haha, it's all my fault
|
[02:01] erickt
|
I'm trying to get zeromq to build outside of the checkout directory and I messed up getting the doc builds to work :)
|
[04:54] cremes
|
andrewvc: ping
|
[04:54] andrewvc
|
yo
|
[04:54] andrewvc
|
sitting next to evanphx right now
|
[04:55] cremes
|
get it up on flickr
|
[04:55] cremes
|
:)
|
[04:55] andrewvc
|
lol
|
[04:55] andrewvc
|
we're looking at that zmq bug actually, going through mailbox.cpp
|
[04:55] cremes
|
so which bug are you and evan chatting about?
|
[04:55] andrewvc
|
I think yours was a separate one, something about kernel buffers?
|
[04:55] andrewvc
|
this is the one I hit a while ago that jruby doesn't have
|
[04:56] cremes
|
the new_sndbuf > old_sndbuf assertion?
|
[04:56] andrewvc
|
mmm this is nbytes != -1 EBADF
|
[04:57] andrewvc
|
mailbox.cpp 241
|
[04:57] andrewvc
|
that one
|
[04:57] cremes
|
ah, your file descriptor one
|
[04:57] andrewvc
|
yep
|
[05:00] cremes
|
btw, i think 0mq is a pretty good exercise of rbx's ffi impl
|
[05:44] andrewvc
|
cremes: still around?
|
[05:45] cremes
|
barely; what's up?
|
[05:46] andrewvc
|
yo, not much, but have a bit of a lead on the issue
|
[05:47] andrewvc
|
basically, zmq is shutting down, but a call to zmq_getsockopt gets through
|
[05:47] andrewvc
|
though the socket has been closed zmq doesn't see it
|
[05:47] andrewvc
|
btw, evan was saying we should probably not use finalizers, as they can be dangerous
|
[05:47] cremes
|
so it's shutting down when you do the getsockopt call?
|
[05:47] andrewvc
|
basically, it's easy to have two FFI pointers to the same data
|
[05:47] andrewvc
|
it shuts down before
|
[05:47] andrewvc
|
but the getsockopt gets through later
|
[05:48] cremes
|
hrm...
|
[05:48] andrewvc
|
and zmq blows up, because an internal FD it uses for that socket, an FD for a socketpair, has been closed
|
[05:48] cremes
|
so what's this about finalizers? don't use 'em for auto-gc?
|
[05:48] andrewvc
|
the cpp mailbox object seems to have been deleted, but not overwritten, but the ffi socket still points to it
|
[05:48] andrewvc
|
that's from evan btw
|
[05:49] cremes
|
the ffi socket...?
|
[05:49] andrewvc
|
ZMQ::Socket
|
[05:49] andrewvc
|
the actual pointer to the socket
|
[05:49] cremes
|
the socketpairs are used for internal signaling, so they aren't exposed to us via ffi at all
|
[05:49] cremes
|
ok, zmq.socket
|
[05:49] andrewvc
|
yeah, but I think they're still tied to the socket
|
[05:50] evan
|
hi cremes
|
[05:50] andrewvc
|
evan's going to pop in
|
[05:50] evan
|
:D
|
[05:50] cremes
|
howdeee
|
[05:50] evan
|
so, real fast
|
[05:50] evan
|
here's how I see it.
|
[05:50] andrewvc
|
lol
|
[05:50] evan
|
FFI -> zmq::getsockopt -> zmq::process_commands -> zmq::mailbox::recv
|
[05:51] evan
|
those last 3 are in C++, in zmq's code base
|
[05:51] cremes
|
hey, when you work 22 hours a day, a dram is all the vacation you need :)
|
[05:51] evan
|
the FFI is called from the getsockopt method
|
[05:51] cremes
|
right
|
[05:51] evan
|
ZMQ_EVENTS is the option requested
|
[05:51] cremes
|
k
|
[05:51] evan
|
it appears that zmq has deleted the mailbox in question
|
[05:52] evan
|
we're still able to access it because the FFI side has a hanging pointer to where it used to be
|
[05:52] evan
|
and it hasn't been overwritten
|
[05:52] evan
|
I believe it has been deleted because I can see that the destructor for the mailbox is what closes the socketpair
|
[05:52] evan
|
and I know that we're getting the EBADF on the socketpair in a mailbox
|
[05:52] cremes
|
ok, then that's a bug
|
[05:53] evan
|
all of that leads me to believe that zmq has decided to shutdown
|
[05:53] evan
|
either because of an internal condition
|
[05:53] evan
|
or because it was requested to
|
[05:53] evan
|
but FFI was not told zmq was shutting down
|
[05:53] evan
|
and thus still holds pointers to the C++ objects that have been deleted
|
[05:53] cremes
|
the socketpair is used for signaling between the virtual socket and the i/o thread that handles all send/recv on the back-end
|
[05:53] cremes
|
so if what you are saying is true
|
[05:54] cremes
|
then we need to show a repro and get them to fix that up
|
[05:54] cremes
|
the socket should only shut down under 2 conditions
|
[05:54] andrewvc
|
well, I think that's all in the strace and backtrace I showed previously
|
[05:54] cremes
|
1. zmq_close() is called on the socket
|
[05:54] andrewvc
|
but I didn't frame it right
|
[05:54] andrewvc
|
well, I never called zmq_close on the socket
|
[05:54] cremes
|
2. zmq_term() is called on the context to shut *everything* down
|
[05:54] andrewvc
|
we're not sure why it's being shut down
|
[05:54] cremes
|
ok, then 3. bug
|
[05:55] andrewvc
|
in my app I never call those things, it could be related to a ruby exception, or perhaps an exception in eventmachine's cext
|
[05:55] andrewvc
|
since moving to jruby fixed the issue
|
[05:55] evan
|
#1 and #2 need to be verified that they're not happening
|
[05:55] evan
|
because of the extra indirection of the FFI binding
|
[05:55] andrewvc
|
yep
|
[05:55] cremes
|
yeah, breakpoints on those c++ functions
|
[05:55] evan
|
it seems possible that zmq_term was called (perhaps by a finalizer?)
|
[05:56] cremes
|
only if a context is garbage collected
|
[05:56] evan
|
sure
|
[05:56] cremes
|
and *only* under 2.1.0... it activates that code based on the version of the 0mq lib that is loaded
|
[05:56] evan
|
basically, you need to verify if those are happening.
|
[05:56] evan
|
or not happening
|
[05:56] evan
|
when the bug occurs
|
[05:56] andrewvc
|
oh, well this was off zmq HEAD
|
[05:56] evan
|
only then can you proceed to #3
|
[05:56] cremes
|
agreed (evan)
|
[05:57] evan
|
additionally, you can certainly repro this all in C++
|
[05:57] cremes
|
maybe andrewvc can ... ;)
|
[05:57] evan
|
you can see if calling zmq_term, then zmq_getsockopt causes an EBADF
|
[05:57] evan
|
it seems likely it would :)
|
[05:57] andrewvc
|
lol, I'd actually really like to, I'll put some more time into this this weekend
|
[05:57] cremes
|
that's easy enough to try
|
[05:58] cremes
|
there are certainly bugs in 0mq; i found another today that i've been chasing for the last 4 days
|
[05:58] cremes
|
i sent them a repro (in ruby)
|
[05:58] evan
|
cremes: yeah, I told andrewvc about that one.
|
[05:58] evan
|
the buffer bug.
|
[05:58] cremes
|
yeah, the same code also crashes rbx... right
|
[05:58] andrewvc
|
yeah
|
[05:59] cremes
|
well, this is all pretty exciting
|
[05:59] cremes
|
thanks for taking some time to dive into this... are you guys as a BoF or something?
|
[05:59] cremes
|
s/as/at/
|
[05:59] andrewvc
|
we're both usually at LA Ruby hack nights on tues
|
[05:59] cremes
|
ah...
|
[06:00] andrewvc
|
actually evan's a cofounder, him and shane becker run it
|
[06:00] cremes
|
cool.... well, you certainly brought in a knotty problem tonight
|
[06:00] andrewvc
|
what's a BoF
|
[06:00] cremes
|
birds of a feather
|
[06:00] cremes
|
it's a wwdc term
|
[06:00] andrewvc
|
yeah, I'm glad evan took a look at it, really interesting stuff going through zeromq internals tracking it down
|
[06:01] cremes
|
heh, i bet evan had some comments on that codebase
|
[06:01] andrewvc
|
lol
|
[06:01] andrewvc
|
how did you know
|
[06:01] cremes
|
i've seen rbx c++ and it's nothing like 0mq c++
|
[06:01] cremes
|
:)
|
[06:01] andrewvc
|
hehe
|
[06:01] cremes
|
d'oh! this is all in-channel! heh heh
|
[06:02] evan
|
cremes: yeah, I did
|
[06:02] evan
|
*eyeroll*
|
[06:02] evan
|
:D
|
[06:02] cremes
|
that's why i'm a ruby guy; i have a hard enough time making that look good let alone attempt c++ anymore
|
[06:02] cremes
|
!!
|
[06:02] cremes
|
well, thanks for digging into this
|
[06:02] cremes
|
i
|
[06:03] cremes
|
think between andrewvc and i we can probably produce a repro for these guys
|
[06:03] cremes
|
that ZMQ_EVENTS code is still pretty new
|
[06:03] cremes
|
and likely needs more shake-down
|
[06:04] cremes
|
andrewvc: are you doing your dripdrop testing under rbx full-time yet?
|
[06:04] andrewvc
|
hah, switched to jruby because of this bug
|
[06:04] andrewvc
|
but if I get this back under control, I'm back on rbx
|
[06:04] cremes
|
pussy
|
[06:04] cremes
|
;)
|
[06:04] evan
|
cremes: cool.
|
[06:04] andrewvc
|
lol, well, I like my stack traces with my errors
|
[06:05] cremes
|
it doesn't make a lot of sense that this would work under jruby but break under rbx, yes?
|
[06:05] andrewvc
|
well, eventmachine has a different java core
|
[06:05] cremes
|
zmq should behave the same in both scenarios
|
[06:05] andrewvc
|
so maybe there's an interaction with the cext
|
[06:05] evan
|
also
|
[06:05] andrewvc
|
or maybe I'm crazy
|
[06:05] cremes
|
crazy
|
[06:05] evan
|
if it is something with finalizers
|
[06:05] evan
|
jruby could have very different finalizer run patterns than rbx
|
[06:05] evan
|
just throwing that out there.
|
[06:06] andrewvc
|
yep, I'll try removing finalizers as well
|
[06:06] cremes
|
finalizers are easy to disable; andrewvc : give that a try
|
[06:06] andrewvc
|
I definitely will
|
[06:06] cremes
|
yeah, the finalizers could be the culprit
|
[06:06] cremes
|
most of these commands run async on the i/o thread so timing can play a part
|
[06:07] evan
|
cremes: i'm packing up my laptop
|
[06:07] andrewvc
|
it was always reproducible in sequence though
|
[06:07] andrewvc
|
I as well, they're closing down
|
[06:07] evan
|
talk to ya tomorrow, back on the battlefield!
|
[06:07] andrewvc
|
I'll ttyl chuck
|
[06:07] cremes
|
evan: g'night
|
[06:07] cremes
|
goodnight to all!
|
[06:07] cremes
|
happy hacking
|
[07:31] sustrik
|
andrewvc,evan,cremes: just a minor note:
|
[07:31] andrewvc
|
yo
|
[07:31] sustrik
|
in 2.1, zmq_term() should not invalidate the sockets
|
[07:31] sustrik
|
what it does
|
[07:31] sustrik
|
it causes the socket to return ETERM on any subsequent call
|
[07:31] sustrik
|
and wait for the user to close the socket
|
[07:32] sustrik
|
so you should not see EBADF in such case
|
[07:32] sustrik
|
if you do, it's a bug
|
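A toy model of the 2.1 behaviour sustrik describes above may help when checking a binding against it. This is only a sketch, not libzmq or any binding's API: `ToyContext`, `ToySocket`, `ToyError` and the string error code are hypothetical names made up for illustration, and `term()` here only sets a flag, whereas the real call may behave differently (for example, it may block).

```python
ETERM = "ETERM"  # placeholder string, not the numeric errno libzmq uses


class ToyError(Exception):
    """Carries the toy error code raised by ToySocket calls."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


class ToySocket:
    def __init__(self, context):
        self.context = context
        self.closed = False

    def getsockopt(self, option):
        # After term(), every call reports ETERM instead of touching
        # internal state, so the caller should never see EBADF.
        if self.context.terminating:
            raise ToyError(ETERM)
        return 0  # dummy option value

    def close(self):
        # The user still owns the socket and must close it explicitly.
        self.closed = True


class ToyContext:
    def __init__(self):
        self.terminating = False
        self.sockets = []

    def socket(self):
        sock = ToySocket(self)
        self.sockets.append(sock)
        return sock

    def term(self):
        # Flag only: sockets stay valid until the user closes them.
        self.terminating = True

    def open_sockets(self):
        return [s for s in self.sockets if not s.closed]


def call_after_term():
    """Terminate the context, then use and close a socket."""
    ctx = ToyContext()
    sock = ctx.socket()
    ctx.term()
    try:
        sock.getsockopt("ZMQ_EVENTS")
        code = None
    except ToyError as exc:
        code = exc.code
    sock.close()
    return code, len(ctx.open_sockets())
```

If a real binding shows EBADF where this model shows ETERM, that points either at a 0MQ bug, as sustrik says, or at something outside 0MQ touching its descriptors.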
[07:32] andrewvc
|
oh, cool, btw I found the answer to that issue I had sustrik
|
[07:32] andrewvc
|
the weird one with the strace and everything
|
[07:32] sustrik
|
what was that?
|
[07:33] andrewvc
|
Well, haven't tested it fully yet, but I'm fairly sure that eventmachine was catching a ruby exception, and then trying to shutdown() and close() all its FDs
|
[07:33] andrewvc
|
including the ZMQ ones
|
[07:33] andrewvc
|
which is not good
|
[07:33] sustrik
|
i see
|
[07:33] andrewvc
|
thanks for taking a look at it though
|
[07:33] sustrik
|
so you've used closed socket afterwards, right?
|
[07:34] andrewvc
|
well, yeah, the zmq_socket wasn't closed, but the FD that ZMQ_FD exposes was
|
[07:34] sustrik
|
oh my
|
[07:34] sustrik
|
:)
|
[07:34] andrewvc
|
yeah, that's why it was so odd to track down lol
|
[07:35] andrewvc
|
thanks for the heads up on zmq_term
|
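The failure mode andrewvc describes can be reproduced without 0MQ at all. A minimal sketch, assuming a POSIX system: a plain socketpair stands in for the internal signalling pair, and `os.close` stands in for an event loop closing a descriptor it does not own. The function name is made up for illustration.

```python
import errno
import os
import socket


def use_after_foreign_close():
    """Return the errno seen when a socket's fd was closed behind its back."""
    owner_end, peer_end = socket.socketpair()
    fd = owner_end.fileno()

    # Some other component closes the raw descriptor. The socket object
    # is not told, so it still believes the descriptor is valid.
    os.close(fd)

    try:
        owner_end.send(b"wakeup")
    except OSError as exc:
        return exc.errno
    finally:
        # Forget the stale descriptor so it is not closed a second time.
        owner_end.detach()
        peer_end.close()
    return None
```

The owner of the socket never called close, yet the next operation fails with EBADF, which matches what was seen in mailbox.cpp.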
[08:05] CIA-21
|
zeromq2: Martin Sustrik master * r820fec7 / include/zmq.h :
|
[08:05] CIA-21
|
zeromq2: Version bumped to 2.2.0
|
[08:05] CIA-21
|
zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/gApquE
|
[08:33] ianbarber
|
the blog post does kind of just dismiss the idea that an event-driven/actor model is practical - but the issues of losing control flow he describes are basically true of any callback-based language, like javascript for example. I would also think the control flow problems would occur in any language which is more declarative than imperative, and there are many successful examples of those too, so I'm not so sure that the negative poin
|
[08:33] ianbarber
|
good one
|
[08:33] ianbarber
|
the stuff about STM looks good, just seems the case against message passing is a bit thin
|
[08:36] ianbarber
|
even in the paper he references, the base MP implementation is in C, for students, which is not necessarily a sensible thing. I would suspect if it were given to Erlang or javascript coders, you'd have different results, as there is a slightly different mindset.
|
[08:36] Guthur
|
I had a little trouble buying into his arguments against MP, but I thought maybe it was because I was sold on 0MQ as one solution to concurrency problems
|
[08:39] sustrik
|
i think he's right that implementing a complex application using MP can be really complex
|
[08:39] sustrik
|
you have hundreds or thousands of components
|
[08:39] guido_g
|
hrhrhr
|
[08:39] sustrik
|
sending messages randomly among them
|
[08:40] sustrik
|
you have little idea of what ordering is going to look like
|
[08:40] guido_g
|
isn't implementing a complex application always... complex?
|
[08:40] sustrik
|
etc.
|
[08:40] sustrik
|
yes, what he's saying is that the complexity of an MP solution and of a shared-data solution is more or less the same
|
[08:41] Guthur
|
well, I suppose it's better than the complexity multiplier of locking strategies
|
[08:41] sustrik
|
hard to say, some research would be needed
|
[08:42] sustrik
|
what really makes difference imo is being able to decompose message flows
|
[08:42] sustrik
|
into sensible well defined subsystems
|
[08:42] ianbarber
|
yeah
|
[08:43] guido_g
|
the message as/is the interface
|
[08:43] sustrik
|
which, of course, is what message patterns do
|
[08:43] sustrik
|
you can't send messages randomly back and forth
|
[08:43] ianbarber
|
i think that for practically building complex systems, it is often about choosing the right pre-existing tools to do the job, then building your own on top
|
[08:43] sustrik
|
you have to say: this is my pub/sub channel
|
[08:43] sustrik
|
and that's what it is
|
[08:45] sustrik
|
so you have a clear idea how the communication works
|
[08:45] ianbarber
|
you know, there's a lot about distributed processing different types of work, but if i have to do one kind i'll use piccolo, another a database, and ideally you'd have a good substrate for having those bits communicate efficiently, and 0mq is good for that. i see a fair few projects that I wish they had used it on the underlying layer, as it would be more reliable than their current communication solution, and more extensible at th
|
[08:45] ianbarber
|
level
|
[08:46] sustrik
|
yes
|
[08:47] sustrik
|
i think any programmer that's been in business for some time has written at least a couple of ad hoc communication solutions :)
|
[08:47] Guthur
|
Microsoft Research were looking into STM in C# (SXM), but the last update was 2005, seems a long time ago
|
[08:48] sustrik
|
the problem with STM, obviously, is that it doesn't scale
|
[08:49] Guthur
|
we are finding scaling issues with our inter threading stuff already
|
[08:50] Guthur
|
I couldn't sell them on MP(0MQ), so they decided to try with explicit locking
|
[08:50] Guthur
|
it's taken at least twice as long, so far, and it still doesn't work
|
[08:52] Guthur
|
That's why I was looking for concurrency articles, I really want to convince them of the difficulty in developing scalable solutions for our problem
|
[08:52] sustrik
|
what's "inter threading"?
|
[08:52] Guthur
|
sorry, I meant communicating between threads
|
[08:53] sustrik
|
i see
|
[08:53] Guthur
|
it's still that same problem I have talked about a week or 2 ago
|
[08:53] Guthur
|
have/had
|
[08:55] Guthur
|
N request threads waiting for data from async reply threads
|
[09:05] pieterh
|
Guthur: are we still discussing that blog post?
|
[09:05] pieterh
|
What he seems to be saying is "I don't have decent MP patterns, so I'm going to invent a better shared memory technology"
|
[09:07] pieterh
|
it is a straw man blog post
|
[09:08] pieterh
|
sustrik_ thanks for the NEWS update, how do we apply this to the two branches now?
|
[09:19] pieterh
|
mikko: you around?
|
[09:20] sustrik
|
pieterh: the news are specific to the release
|
[09:20] sustrik
|
thus 2.1.1 release notes make no sense in 2.2 branch
|
[09:20] pieterh
|
sustrik_ ... the NEWS in your master should be a complete history, no?
|
[09:20] pieterh
|
do you intend to remove old release notes?
|
[09:21] pieterh
|
it goes back to version 0.2...
|
[09:22] sustrik
|
but it doesn't make sense once you have two branches
|
[09:22] pieterh
|
i really want my branch to be passive, not fragment the main git
|
[09:24] pieterh
|
everything in your NEWS file is for historical interest
|
[09:24] pieterh
|
I'd recommend that remain a complete record of all officially released versions
|
[09:24] pieterh
|
if someone forks a branch and makes their own patches, that's different
|
[09:25] pieterh
|
but this is about a continuous recorded history
|
[09:25] sustrik
|
the problem is that there are overlaps
|
[09:25] pieterh
|
for example the same patch on 2.2 and 2.1?
|
[09:25] sustrik
|
yes
|
[09:25] pieterh
|
we handled that in OpenAMQ quite easily
|
[09:26] sustrik
|
but the main problem is there's *no* 2.1 history in 2.2 repo
|
[09:26] sustrik
|
so placing 2.1 release notes there is inconsistent
|
[09:26] pieterh
|
there is no 2.1 tag in 2.2 repo, that's true
|
[09:27] pieterh
|
I don't know what you mean by "there is no 2.1 history"
|
[09:27] sustrik
|
git repo doesn't contain the 2.1 branch
|
[09:27] pieterh
|
a branch is not history
|
[09:27] sustrik
|
you've created separate repo yesterday
|
[09:28] pieterh
|
a branch is just an access path
|
[09:28] sustrik
|
so the NEWs in 2.2 repo would refer to code in 2.1 repo
|
[09:28] pieterh
|
yes
|
[09:28] pieterh
|
and since we're coordinating between the two repos, using pull requests, etc. that's normal
|
[09:29] pieterh
|
look, I can fork the NEWS file, np, but it is IMO wrong and we'll have to fix it up later
|
[09:29] sustrik
|
it's ok imo
|
[09:29] pieterh
|
this will lead to the 2.1 repo becoming detached in some sense
|
[09:30] sustrik
|
that's the goal, isn't it
|
[09:30] pieterh
|
that only makes sense if we likewise detach a 2.0 repo at some point
|
[09:30] pieterh
|
in that case, yes
|
[09:30] sustrik
|
you can detach 2.0 if you want
|
[09:30] sustrik
|
i'm happy with that
|
[09:31] pieterh
|
happy = formal approval? :-)
|
[09:31] sustrik
|
yes. i'll be happy to get rid of it :)
|
[09:31] pieterh
|
ok, we have a good plan then
|
[09:31] pieterh
|
get 2.0 out of your hair as well
|
[09:31] pieterh
|
and provide an easy way to maintain these parallel versions
|
[09:32] pieterh
|
we do have people using 2.0 in production and there will be patches on it at some point
|
[09:32] pieterh
|
so we're into three rolling versions, which is where I like to be...
|
[09:32] pieterh
|
boring, normal, exciting...
|
[09:33] sustrik
|
ok
|
[09:33] pieterh
|
sustrik_: thanks, I have everything I need to make the release now
|
[09:33] pieterh
|
just let me grab a coffee...
|
[09:34] pieterh
|
oh, everyone, I have news that Brett Cameron (who ported 0MQ to OpenVMS), who lives in Christchurch NZ, is safe, though his house was badly damaged
|
[09:53] mikko
|
pieterh: am now
|
[09:54] pieterh
|
mikko: just in time... what's the status of OpenPGM wrt MSVC on the master?
|
[09:54] mikko
|
pieterh: builds without issues
|
[09:54] mikko
|
there is WithOpenPGM build target
|
[09:54] mikko
|
no wait
|
[09:55] pieterh
|
excellent, that's what... wait...
|
[09:55] mikko
|
it's broken
|
[09:55] pieterh
|
ok, that's fine
|
[09:55] mikko
|
http://build.zero.mq/job/ZeroMQ2-core-master_MSVC-win7/
|
[09:55] mikko
|
it's fixed it seems. there was an issue using different definition of bool between the two iirc
|
[09:56] mikko
|
http://snapshot.zero.mq/msvc2008/ these snapshots are built with pgm enabled
|
[09:56] mikko
|
so in theory people can just download the dll and import library
|
[09:56] pieterh
|
lovely
|
[09:56] pieterh
|
I'm making the 2.1.1 (rc1) release now
|
[09:57] mikko
|
this means that autoconf builds for openpgm have to wait until 2.2.x ?
|
[09:57] pieterh
|
we can try to get them into rc2, np
|
[09:57] mikko
|
ok cool
|
[09:58] pieterh
|
i'm aiming at three releases, so stable after rc2, depending on how many issues we have
|
[09:58] mikko
|
people are gonna test it when it's stable anyway
|
[09:58] pieterh
|
you think people won't use an rc?
|
[09:59] mikko
|
well, the large majority will only jump ship when it's stable
|
[09:59] pieterh
|
we'll see, if no-one reports issues on the rcs then next time we can lie and just push straight from git master to stable
|
[09:59] pieterh
|
and then continue with service packs... :-)
|
[10:00] mikko
|
paid-for service packs
|
[10:00] mikko
|
like windows back in the day
|
[10:01] sustrik
|
pieterh: just keep on incrementing the third version number
|
[10:01] sustrik
|
no need for dourth one
|
[10:01] sustrik
|
fourth
|
[10:01] pieterh
|
sustrik_: yes, that's always the case but the problem is the 'beta' vs. 'stable' terminology
|
[10:01] sustrik
|
a
|
[10:01] sustrik
|
ok
|
[10:01] pieterh
|
we need a finer, more precise terminology
|
[10:02] pieterh
|
in fact there are three levels afaics
|
[10:02] pieterh
|
ABI version, package/software version, stability level
|
[10:02] sustrik
|
the stability level is pretty informal concept
|
[10:03] pieterh
|
yet it's the most critical for users
|
[10:03] sustrik
|
one man's beta is another man's stable
|
[10:03] mikko
|
patch level some say
|
[10:03] pieterh
|
no, it's a statement of collective confidence
|
[10:03] pieterh
|
"you can trust this for production"
|
[10:03] pieterh
|
"we recommend you use this for new projects"
|
[10:03] pieterh
|
"this version is boring and old but very well known"
|
[10:03] pieterh
|
etc.
|
[10:04] mikko
|
http://apr.apache.org/versioning.html
|
[10:04] pieterh
|
mikko: APR has been "boring and old and very well known" since 1999
|
[10:04] pieterh
|
ok, I'll make it up as I go along and if anyone dislikes my terminology, just shout
|
[10:05] mikko
|
sustrik_: is there a plan to merge zmq devices into one binary?
|
[10:06] pieterh
|
mikko: I'd like to remove device main programs from the core
|
[10:06] sustrik
|
there's a plan to drop the devices altogether
|
[10:06] pieterh
|
I already did this in 2.0.7 or so but it was rolled back
|
[10:06] sustrik
|
yeah, but can't do that till 3.0
|
[10:06] pieterh
|
sustrik_: why not? it's not an API issue
|
[10:06] pieterh
|
the *main* programs, not the zmq_device() function
|
[10:06] sustrik
|
it's a backward compatibility issue
|
[10:07] pieterh
|
sustrik_: how?
|
[10:07] sustrik
|
you install a new version and suddenly the devices you were using are no longer there
|
[10:07] pieterh
|
the device main programs are not even documented, nor finished, nor really usable
|
[10:08] pieterh
|
no configuration, no command line control, nothing
|
[10:08] mikko
|
they have xml config
|
[10:08] pieterh
|
sorry, yes, they do
|
[10:08] pieterh
|
I'd like to move them out of core ASAP
|
[10:08] sustrik
|
i've promised to keep backwards compatibility till next major version
|
[10:08] sustrik
|
and i'll keep the promise
|
[10:08] pieterh
|
sustrik_: it's for APIs, not these addons
|
[10:09] pieterh
|
that's what it says in the contract
|
[10:09] pieterh
|
"Updates don't change API, ABI, wire protocol or semantics. "
|
[10:09] pieterh
|
it's quite explicit
|
[10:10] pieterh
|
Plus, "New minor version allows for new functionality (i.e. new APIs), slight changes to the semantics etc."
|
[10:10] sustrik
|
exactly
|
[10:10] sustrik
|
command lines are part of the API
|
[10:10] pieterh
|
not unless they are documented
|
[10:10] pieterh
|
the devices are not documented
|
[10:10] sustrik
|
:)
|
[10:10] pieterh
|
I rest my case, m'lod
|
[10:10] pieterh
|
further... as long as they sit in core, they are not improved
|
[10:11] sustrik
|
yes, they are frozen
|
[10:11] pieterh
|
I have far superior code ready
|
[10:11] sustrik
|
marked to be removed in 3.0
|
[10:11] pieterh
|
bleh
|
[10:11] sustrik
|
backward compatibility
|
[10:11] sustrik
|
shrug
|
[10:11] pieterh
|
I bet you a beer: ask on zeromq-dev if anyone objects to moving these today to another repo
|
[10:11] pieterh
|
each objection, 1 beer I owe you
|
[10:12] sustrik
|
it's not about consensus, it's about contract
|
[10:12] sustrik
|
so that even those that are not on the ml have clear guarantees
|
[10:12] pieterh
|
so point me to the contract that covers these device main programs
|
[10:12] sustrik
|
about upgrading
|
[10:12] pieterh
|
either documentation
|
[10:12] pieterh
|
or explicit contract...
|
[10:12] pieterh
|
where?
|
[10:13] sustrik
|
"Updates don't change API, ABI, wire protocol or semantics"
|
[10:13] pieterh
|
SYNOPSIS
|
[10:13] pieterh
|
To be written.
|
[10:13] pieterh
|
DESCRIPTION
|
[10:13] pieterh
|
To be written.
|
[10:13] pieterh
|
OPTIONS
|
[10:13] pieterh
|
To be written.
|
[10:13] pieterh
|
if you define undocumented command lines as "API"... well...
|
[10:13] pieterh
|
fine
|
[10:13] sustrik
|
application interface
|
[10:13] sustrik
|
that's it
|
[10:14] pieterh
|
"To be written" is not an API contract
|
[10:14] pieterh
|
sorry, but it's not
|
[10:14] pieterh
|
"read the code" is not a contract
|
[10:14] sustrik
|
the API itself is the contract
|
[10:14] pieterh
|
documentation is the very basis of a contract
|
[10:14] pieterh
|
you have used that argument yourself on multiple occasions
|
[10:14] pieterh
|
"I've deliberately not documented it so that I can change or remove it"
|
[10:14] pieterh
|
well, whatever
|
[10:14] sustrik
|
yes, but then i've got yelled at on several occasions
|
[10:15] sustrik
|
so i've written the contract
|
[10:15] pieterh
|
shrug
|
[10:15] sustrik
|
let me make the contract more precise about the guarantees...
|
[10:16] mikko
|
i think martin has a good point here
|
[10:16] mikko
|
this might break for example possible automated builds people are using
|
[10:16] mikko
|
etc
|
[10:16] pieterh
|
basically, by insisting that these addons remain in core, you're throttling the use of devices
|
[10:16] pieterh
|
mikko: it's for a 2.2 release at best
|
[10:16] sustrik
|
wait a sec
|
[10:17] sustrik
|
why throttling?
|
[10:17] mikko
|
pieterh: what do you mean by that?
|
[10:17] pieterh
|
well, look at the code, and docs
|
[10:17] pieterh
|
compare to...https://github.com/zeromq/zfl/tree/master/examples
|
[10:17] mikko
|
can we document that these devices are provided as an example? and setup a separate wiki page for devices people create?
|
[10:17] sustrik
|
people should write their own devices
|
[10:17] pieterh
|
mikko asks a question that was resolved ages ago
|
[10:17] sustrik
|
or you can provide them as a library
|
[10:18] pieterh
|
done that, done that
|
[10:18] sustrik
|
what's the problem then?
|
[10:18] pieterh
|
but as long as these remain in zeromq2, people will assume those are the only official devices
|
[10:18] pieterh
|
and they are not maintained
|
[10:18] pieterh
|
and they are not documented
|
[10:18] sustrik
|
ok, let me fix the docs
|
[10:18] pieterh
|
and they are rubbish
|
[10:18] pieterh
|
sorry, but XML dependencies in 0MQ core is rubbish
|
[10:18] sustrik
|
it'll say: obsolete, marked for removal in 3.0
|
[10:18] mikko
|
hi Steve-o
|
[10:18] Steve-o
|
hi mikko
|
[10:18] mikko
|
Steve-o: minor build problem: solaris studio 12.2 + autoconf
|
[10:19] mikko
|
i don't actually know what the error means
|
[10:19] Steve-o
|
wondered what changes I need for Solaris
|
[10:19] mikko
|
this is on linux
|
[10:19] Steve-o
|
I know the ticket locks will be broken
|
[10:19] Steve-o
|
the concept doesn't work on SPARC processors
|
[10:19] mikko
|
this is x86 linux + solaris studio 12.2
|
[10:20] Steve-o
|
oracle studio I think it's called now?
|
[10:20] Steve-o
|
or oracle solaris studio
|
[10:20] mikko
|
hehe
|
[10:20] mikko
|
possibly
|
[10:20] Steve-o
|
what's the error? I'll have the machines up tomorrow
|
[10:20] sustrik
|
Steve-o, mikko: can you possibly check whether steve's recent update solved the bool issue with MSVC?
|
[10:20] Steve-o
|
fixed OSX today
|
[10:20] mikko
|
Steve-o: give me a sec
|
[10:20] mikko
|
sustrik_: the daily build seems ok
|
[10:21] sustrik
|
so you are building from openpgm head?
|
[10:21] sustrik
|
rather than embedded openpgm in 0mq?
|
[10:21] mikko
|
Steve-o: https://gist.github.com/edeacc2d3214fc6b11df
|
[10:22] mikko
|
sustrik_: not quite
|
[10:22] mikko
|
sustrik_: it's not against head but it's using openpgm different than one shipped with zeromq
|
[10:22] sustrik
|
i haven't updated the embedded package yet...
|
[10:22] sustrik
|
ah
|
[10:22] sustrik
|
so should i upgrade?
|
[10:22] mikko
|
MSVC is difficult about this
|
[10:23] mikko
|
sustrik_: i think we should update as soon as we can iron out the autoconf build to be stable enough
|
[10:23] mikko
|
and do it in same go
|
[10:23] sustrik
|
ok, good
|
[10:23] sustrik
|
i'll wait then
|
[10:23] mikko
|
MSVC people will most likely build using openpgm installer anyway
|
[10:23] mikko
|
but
|
[10:23] mikko
|
i haven't updated openpgm on that box for a while
|
[10:24] sustrik
|
the build used to fail few days ago
|
[10:24] sustrik
|
now it builds
|
[10:24] sustrik
|
something must have changed
|
[10:24] mikko
|
hmm
|
[10:24] mikko
|
i wonder if the MSVC is smart enough to do incremental build properly
|
[10:24] sustrik
|
last failed build feb 22nd, 5 am
|
[10:24] mikko
|
as it seems here: https://github.com/zeromq/zeromq2/commit/43e8868875e1d5287979e5b9060a9b16be45cc79
|
[10:25] mikko
|
unrelated changes but it touched the files
|
[10:25] sustrik
|
could that be the reason?
|
[10:26] sustrik
|
it would make sense then
|
[10:26] Steve-o
|
mikko: OK it's a typo, should be lock; not lockl;
|
[10:26] Steve-o
|
atomic.h line 59
|
[10:27] mikko
|
sustrik_: i'll make sure that msvc build wipes out all files and builds clean every time
|
[10:27] sustrik
|
strange then
|
[10:28] sustrik
|
anyway, it builds now
|
[10:28] sustrik
|
so let's not worry about it
|
[10:31] Steve-o
|
mikko: updated trunk with fix
|
[10:31] mikko
|
Steve-o: thanks
|
[10:31] mikko
|
Steve-o: i'll test x86 solaris soon
|
[10:31] Steve-o
|
Is there a simple autoconf check to detect whether non-32bit aligned pointers work?
|
[10:31] mikko
|
Steve-o: https://build.zero.mq/job/ZeroMQ2-core-master_MSVC-win7/194/console
|
[10:32] mikko
|
Steve-o: AC_TRY_RUN maybe
|
[10:32] Steve-o
|
Windows has ticket locks enabled, but only in CMake not in Autoconf,
|
[10:33] mikko
|
Steve-o: building with /t:rebuild now
|
[10:33] mikko
|
ermm, sustrik_
|
[10:36] mikko
|
sustrik_: https://build.zero.mq/job/ZeroMQ2-core-master_MSVC-win7/194/console
|
[10:36] mikko
|
builds
|
[10:36] sustrik
|
yup
|
[10:37] sustrik
|
i was just puzzled how it happened to start building given there was no related change to the code
|
[10:37] sustrik
|
anyway, not an issue
|
[10:37] mikko
|
i think msbuild is not very smart about incremental builds
|
[10:37] sustrik
|
ack
|
[10:37] mikko
|
it now does a clean build each time
|
[10:38] mikko
|
Steve-o: builds clean now
|
[10:39] ianbarber
|
Steve-o: while you're here - you know that issue I sent you some code on? Dr Bob has hit it on the mailing list as well.
|
[10:39] ianbarber
|
It looks like it occurs because 0MQ is erroneously setting PGM_SEND_ONLY on both
|
[10:40] Steve-o
|
ianbarber: that would make sense considering the error message
|
[10:40] ianbarber
|
but I'm not sure if there's anything that needs to be set instead of that
|
[10:40] sustrik
|
ianbarber: i think it has to do with recent changes to 0mq codebase
|
[10:40] sustrik
|
let me check
|
[10:40] Steve-o
|
the code is still on my todo list
|
[10:41] ianbarber
|
sustrik_: i think what's needed is to use something other than requires_in and requires_out in pgm_socket
|
[10:41] sustrik
|
exactly
|
[10:41] sustrik
|
it should check socket type instead
|
[10:44] ianbarber
|
just trying if (options.type == ZMQ_XPUB || options.type == ZMQ_PUB) {
|
[10:44] ianbarber
|
in connect_session
|
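A minimal sketch of the idea discussed above: decide whether a PGM endpoint should be a sender or a receiver from the socket type, rather than from the requires_in / requires_out flags. Only the publisher-side check (ZMQ_XPUB or ZMQ_PUB) appears in the log; the receiver branch, the `SketchSocketType` constants, `PgmRole`, and `pgm_role_for` are hypothetical names used for illustration and are not taken from the 0MQ source.

```cpp
// Stand-in socket types so the sketch compiles without zmq.h.
// Real code would compare options.type against ZMQ_PUB, ZMQ_XPUB, etc.
enum SketchSocketType {
    SKETCH_PUB,
    SKETCH_SUB,
    SKETCH_XPUB,
    SKETCH_XSUB,
    SKETCH_REQ  // any type that PGM does not serve
};

enum class PgmRole { Sender, Receiver, Unsupported };

// Hypothetical helper: pick the PGM role from the socket type alone.
inline PgmRole pgm_role_for(SketchSocketType type)
{
    // Publisher-side sockets only send over PGM (the check quoted in the log).
    if (type == SKETCH_XPUB || type == SKETCH_PUB)
        return PgmRole::Sender;

    // Assumption: subscriber-side sockets only receive.
    if (type == SKETCH_XSUB || type == SKETCH_SUB)
        return PgmRole::Receiver;

    return PgmRole::Unsupported;
}
```

With a check of this shape, a PUB socket and a SUB socket can no longer both end up configured as send-only.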
[10:46] sustrik
|
just did the same myself :)
|
[10:46] sustrik
|
but i cannot test it
|
[10:46] sustrik
|
so let me know how it goes
|
[10:46] ianbarber
|
will do
|
[10:46] ianbarber
|
my little macbook air takes a while to run make :)
|
[10:48] ianbarber
|
seems to be working!
|
[10:48] ianbarber
|
Steve-o: i am still seeing quite a few of these: Trace: Recv again on not-full
|
[10:48] ianbarber
|
Trace: Discarded packet for muted receiver.
|
[10:48] ianbarber
|
but that may be because i have both bits running on one box at the moment
|
[10:48] Steve-o
|
that will happen with multicast loop
|
[10:49] ianbarber
|
cool
|
[10:49] sustrik
|
ianbarber: ok, so let me apply the patch
|
[10:49] Steve-o
|
Linux routing for multicast is a bit surprising
|
[10:50] CIA-21
|
zeromq2: Martin Sustrik master * r29e0e7d / src/connect_session.cpp :
|
[10:50] CIA-21
|
zeromq2: Incorrect PGM sender/receiver creation fixed
|
[10:50] CIA-21
|
zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/gaABJu
|
[10:51] ianbarber
|
lovely, thanks sustrik & Steve-o !
|
[10:51] pieterh
|
mikko: I've updated apisite to handle the new 2.1 repo structure
|
[10:52] sustrik
|
pieterh: first patch to backport!
|
[10:52] pieterh
|
hehe
|
[10:52] sustrik
|
how do i send a pull request?
|
[10:52] pieterh
|
that was fast
|
[10:52] pieterh
|
go to the zeromq2-1 repo and create a pull request interactively, I assume...
|
[10:52] mikko
|
pieterh: do i need to change the daily builds?
|
[10:53] pieterh
|
mikko, yes, there is a new repo at zeromq2-1 with a master branch
|
[10:53] pieterh
|
you're also building 2.0.10 I assume?
|
[10:53] mikko
|
well, im building branches master and maint
|
[10:53] pieterh
|
we'll break that off into a separate zeromq2-0 repo once this works
|
[10:53] pieterh
|
so it'll be consistent
|
[10:54] mikko
|
why do we use separate repos for this?
|
[10:54] pieterh
|
ah, long discussion
|
[10:54] pieterh
|
basically it's the simplest for a safe division of labour
|
[10:54] pieterh
|
we need more than master vs. maint
|
[10:54] pieterh
|
we need at least three branches and there's no way to do that in the current structure
|
[10:55] pieterh
|
otherwise we have no way of maintaining both 2.0.10 and 2.1.1
|
[10:55] pieterh
|
yet we have production users on both those
|
[10:55] sustrik
|
pieterh: i can't
|
[10:55] sustrik
|
github doesn't recognise your repo as a fork of zeromq2
|
[10:55] sustrik
|
and thus doesn't allow me to send a pull request
|
[10:55] pieterh
|
sustrik: ah github insanity
|
[10:55] sustrik
|
check it out yourself
|
[10:56] pieterh
|
does not allow a fork of a repo within the same organization!
|
[10:56] pieterh
|
simply not allowed
|
[10:56] ianbarber
|
you should be able to add the master as an upstream repo and pull commits from there manually i guess?
|
[10:56] pieterh
|
and I think using a separate organization would be wrong
|
[10:56] sustrik
|
why not create the fork under imatix then?
|
[10:56] mikko
|
pieterh: i mean why didn't you create new branches inside the same repo?
|
[10:57] pieterh
|
mikko: old argument over branching in git repo
|
[10:57] pieterh
|
the only resolution to that was to not touch each others' repos
|
[10:57] pieterh
|
I'd have preferred one repo with X branches, one per release, that could be maintained
|
[10:57] pieterh
|
I proposed this about six months ago
|
[10:57] pieterh
|
and was quite loudly told to stfu, it's not the git way
|
[10:58] pieterh
|
so we ended up with master / maint
|
[10:58] pieterh
|
but that does not let us maintain more than one stable release
|
[10:58] pieterh
|
so we're stuck between freezing 2.0.x or not stabilizing 2.1.x
|
[10:58] pieterh
|
hence the current solution
|
[10:59] pieterh
|
sustrik: yes, I can fork the repo in imatix but if it's a fork IMO we have other issues
|
[10:59] pieterh
|
forget pull requests then, it's not the only option
|
[10:59] pieterh
|
forking would mean I essentially track your master branch
|
[10:59] pieterh
|
which is not what we want at all
|
[10:59] mikko
|
the natural problem with this are the tools. not being able to switch between branches in one clone but rather the need to clone two repos
|
[11:00] pieterh
|
mikko: I think the best answer here is that one repo should 'belong' to as few people as possible
|
[11:00] pieterh
|
and it should be up to those people how they work
|
[11:00] pieterh
|
full freedom inside your own repo
|
[11:00] sustrik
|
+1
|
[11:00] pieterh
|
indeed
|
[11:00] pieterh
|
sustrik and I agree violently on this
|
[11:00] pieterh
|
it's the only scalable approach we can see
|
[11:01] pieterh
|
bitter experience tells us when we make a git structure complex, bad things happen
|
[11:01] pieterh
|
sustrik: what I suggest are patches and issues, it's very close to pull requests
|
[11:01] pieterh
|
i.e. open an issue and copy/paste the patch
|
[11:02] pieterh
|
alternatively I can pull commits from your repository if I know the commit number
|
[11:02] pieterh
|
as ianbarber says
|
[11:02] mikko
|
pieterh: i disagree with both of you biolently
|
[11:02] pieterh
|
:-)
|
[11:02] mikko
|
violently
|
[11:02] pieterh
|
mikko: then you can go create your own repo and do it your way
|
[11:02] mikko
|
you can organise your local development with git as you please
|
[11:02] pieterh
|
that's the beauty of this approach
|
[11:02] mikko
|
have branches in a way you want
|
[11:03] mikko
|
just making sure that release branches contain what they need to contain
|
[11:03] pieterh
|
well, yes, except trying to make that work in the development git failed
|
[11:03] mikko
|
and now we are creating issues and copy-pasting patches in to them?
|
[11:03] sustrik
|
ok, i'll send you commit numbers
|
[11:04] pieterh
|
mikko: where were you when I was being assaulted for proposing release branches?
|
[11:04] pieterh
|
it is too late now, really
|
[11:04] mikko
|
pieterh: i didn't really interact with the community much back then
|
[11:04] mikko
|
you were too ahead of time :)
|
[11:04] pieterh
|
the tank wheels of history have crushed that opportunity
|
[11:05] pieterh
|
sp|ke: hello
|
[11:06] pieterh
|
ianbarber: can you point me to instructions on adding upstream repositories? or is it simple?
|
[11:06] pieterh
|
sustrik: I'll try with that patch, see how it goes
|
[11:06] mikko
|
pieterh: git remote add upstream <uri>
|
[11:07] mikko
|
upstream being arbitrary name
|
[11:07] pieterh
|
thx!
|
[11:07] mikko
|
then git fetch upstream
|
[11:07] mikko
|
you might like this: http://beebo.org/stuff/tech/git.html
|
[11:07] pieterh
|
git fetch upstream does what exactly?
|
[11:07] pieterh
|
sorry, will rtfm but git demands too much space in my brain
|
[11:08] mikko
|
it fetches the information about the remote repository
|
[11:08] mikko
|
so that you know what branches it contains etc
|
[11:08] pieterh
|
let's say I want to pull one commit from Martin's repo
|
[11:08] pieterh
|
<uri> is the read-only git uri, fine
|
[11:09] mikko
|
pieterh: you might have a lot easier time with https://github.com/defunkt/github-gem
|
[11:09] mikko
|
it abstracts some of this normal github workflow
|
[11:09] pieterh
|
well, let's try with the basics
|
[11:10] pieterh
|
I add the upstream remote, then fetch that
|
[11:10] pieterh
|
next?
|
[11:10] pieterh
|
sp|ke: it happens automatically
|
[11:10] ianbarber
|
git cherry-pick
|
[11:10] pieterh
|
sp|ke: have you read the Guide yet?
|
[11:11] pieterh
|
ianbarber: so "git cherry-pick upstream/<commit>"?
|
[11:11] ianbarber
|
git cherry-pick <biglong-git-id-of-the-commit>
|
[11:12] pieterh
|
ph@ws200901:~/work/zeromq2-1$ git cherry-pick 29e0e7d
|
[11:12] pieterh
|
Finished one cherry-pick.
|
[11:12] pieterh
|
[master f4c1040] Incorrect PGM sender/receiver creation fixed
|
[11:12] pieterh
|
Author: Martin Sustrik <sustrik@250bpm.com>
|
[11:12] pieterh
|
1 files changed, 2 insertions(+), 2 deletions(-)
|
[11:12] pieterh
|
yay!
|
[11:12] pieterh
|
sustrik: okay, we have liftoff
|
[11:13] ianbarber
|
woo
|
[11:13] pieterh
|
sustrik: could you create issues with commit tags like this then?
|
[11:13] pieterh
|
that allows public discussion & review of exactly what goes into stable releases
|
[11:14] pieterh
|
IMO you can link directly to the commit, e.g. https://github.com/zeromq/zeromq2/commit/29e0e7dbadfcd0bab70feee119bd7c5e623b38d4
|
[11:15] mikko
|
pieterh: you can also do the following in your local repo:
|
[11:15] mikko
|
git checkout -b mrsustrik <localname>/<branchname>
|
[11:16] mikko
|
that creates a local branch that tracks for example upstream/master
|
[11:16] mikko
|
so when you do for example git diff in that branch it's against upstream/master rather than origin/master
|
[11:16] mikko
|
also, you can diff directly against remote one by doing "git diff upstream/master"
|
[11:17] pieterh
|
git is too powerful
|
[11:17] pieterh
|
all the stuff I *can* do
|
[11:18] pieterh
|
mikko: regarding your violent objection to separate repos
|
[11:18] pieterh
|
our main goals are (a) independence for teams and (b) simplicity
|
[11:19] pieterh
|
our experience is that trying to do too much in one git leads to arguments and complexity
|
[11:20] mikko
|
but think about the following:
|
[11:20] mikko
|
if we are moving to pull requests the development doesnt have to happen in zeromq/zeromq
|
[11:20] mikko
|
we can open pull requests
|
[11:20] mikko
|
cherry-pick
|
[11:20] mikko
|
etc
|
[11:20] pieterh
|
indeed
|
[11:20] pieterh
|
that's already happening to some extent
|
[11:20] mikko
|
this allows everyone to organise their fork as they please
|
[11:20] pieterh
|
indeed
|
[11:21] mikko
|
but still keeps upstream in one place
|
[11:21] pieterh
|
exactly
|
[11:21] mikko
|
which is better for tools such as jenkins
|
[11:21] mikko
|
and not just that
|
[11:22] mikko
|
not having to remember what _repo_ contains what
|
[11:22] pieterh
|
each repository has its target audience, tradeoffs
|
[11:22] pieterh
|
and someone who aims to make that particular mix work
|
[11:22] mikko
|
rather once you've cloned a repo you have it all
|
[11:22] mikko
|
pieterh: what do you mean by that?
|
[11:22] pieterh
|
well, e.g. some people will want multiple release branches in one repo
|
[11:22] pieterh
|
because they want to manage it all centrally (or by themselves)
|
[11:22] mikko
|
what i'm saying is distribute the development rather than the results of it
|
[11:23] pieterh
|
others will want to delegate that work as far as possible
|
[11:23] mikko
|
have multiple forks of one integration point rather than multiple integrations points for the same thing
|
[11:23] pieterh
|
your long words hurt my brain
|
[11:23] pieterh
|
we only have one integration point, which is basically Martin Sustrik
|
[11:23] mikko
|
i'm a consultant by trade
|
[11:23] mikko
|
:)
|
[11:24] pieterh
|
people are integration points, remember
|
[11:24] pieterh
|
not repositories
|
[11:24] pieterh
|
each project really has one brain at its center IME
|
[11:24] pieterh
|
now, MS doesn't want to be bothered creating stable releases of 0MQ
|
[11:25] pieterh
|
so that work becomes another project, with another brain
|
[11:25] pieterh
|
and thus, naturally, another repository
|
[11:25] pieterh
|
i'm not sure how this translates to development of the *core* library
|
[11:26] pieterh
|
if you work in terms of layers then it does scale
|
[11:26] pieterh
|
OpenPGM has its own repository and its own brain (Hi Steve-o!)
|
[11:26] pieterh
|
I'm a software architect by trade
|
[11:27] mikko
|
i'm still failing to see the benefit of separate repository
|
[11:27] mikko
|
i can see the reasoning with one repo per project
|
[11:27] mikko
|
but what you are describing as a project clashes with my concept of a project
|
[11:28] mikko
|
following that logic i should have a repo for build fixes
|
[11:28] mikko
|
steve should have a repo for pgm related fixes
|
[11:28] pieterh
|
it is always about the people
|
[11:29] pieterh
|
and how much complexity they can tolerate
|
[11:29] pieterh
|
"stable version of 0MQ/2.1" has become a "project"
|
[11:29] pieterh
|
by making that decision we remove a chunk of complexity from "development of 0MQ" as a project
|
[11:30] sustrik
|
i would formulate it this way: does linux have a RHEL5 branch in his vanilla repo?
|
[11:30] ianbarber
|
i don't think the two of you are that far off really. Mikko: there is one integration point, the master on zeromq, all of the patches for the other projects come from there
|
[11:30] sustrik
|
no, they are separate projects
|
[11:30] pieterh
|
indeed, it's a tool we're using to reduce complexity
|
[11:30] ianbarber
|
yeah, but is the idea not to push patches into development, then cherry pick ones out to the stable release projects
|
[11:31] pieterh
|
yes
|
[11:31] pieterh
|
when possible and sane
|
[11:31] pieterh
|
but that is a different question, about the trust relationships between these people and projects
|
[11:32] pieterh
|
does sustrik trust a patch coming from me?
|
[11:32] pieterh
|
inherently, no
|
[11:32] pieterh
|
would I trust a patch coming from sustrik?
|
[11:32] pieterh
|
totally
|
[11:32] pieterh
|
take one of my projects, like ZFL, and its the other way around of course
|
[11:33] sustrik
|
i think the problem for contributors is where to submit the patches now
|
[11:33] mikko
|
pieterh: if im using zeromq 2.1 and i have an issue with it where do i raise the issue?
|
[11:34] pieterh
|
anywhere you think it'll get attention
|
[11:34] pieterh
|
you're speaking to a person, remember
|
[11:34] pieterh
|
who do you talk to, the guy who provided you 2.1 or the guy who you think really knows the issue?
|
[11:34] pieterh
|
classic question of workflow
|
[11:34] mikko
|
i dont know
|
[11:34] mikko
|
im a user
|
[11:34] pieterh
|
if you are Mikko, of course you talk to the developer
|
[11:34] mikko
|
i got a problem and i see this list of github projects
|
[11:34] pieterh
|
if you're a user, you talk to the packager
|
[11:34] pieterh
|
nothing new here
|
[11:35] pieterh
|
if I find a bug in Linux, I immediately pick up the phone and call my buddy Linus
|
[11:35] pieterh
|
I'd expect the 2-1 repo to be fairly low-key
|
[11:36] pieterh
|
it does not require visibility, really
|
[11:36] pieterh
|
but perhaps it does
|
[11:46] Steve-o
|
mikko: I have a patch ready to enable ticket spinlocks in Autoconf, need some testing first
|
[11:47] mikko
|
Steve-o: do you want me to test something?
|
[11:47] mikko
|
running pkg-get upgrade on the solaris box atm
|
[11:48] Steve-o
|
I'm waiting for some files to copy over wifi, then I can reboot & test on Linux, then I'll update trunk
|
[11:49] Steve-o
|
I'm trying to avoid SPARC & Sun Pro x86 assembler
|
[11:50] Steve-o
|
I don't have the brain cells to convert my GCC or MSVC assembler to Sun Pro friendly constraints
|
[11:57] pieterh
|
sustrik: ok, release made, announced, API website updated, etc.
|
[12:00] pieterh
|
sustrik_: I've used the terminology "2.1 stable (rc1)" so that more people will use this package
|
[12:01] mikko
|
pieterh: remember to tweet as well
|
[12:01] pieterh
|
ah, of course
|
[12:01] pieterh
|
everyone should tweet
|
[12:01] pieterh
|
the more the merrier
|
[12:05] ianbarber
|
what are the known instabilities with PGM? just the rate issue?
|
[12:06] pieterh
|
yup
|
[12:06] pieterh
|
plus there are several unknown instabilities
|
[12:06] guido_g
|
like the one I reported to the ml
|
[12:07] pieterh
|
until people have severely hammered 0MQ+OpenPGM5, I'd expect there to be issues
|
[12:07] mikko
|
would we benefit from something like trac or redmine?
|
[12:08] mikko
|
very lightweight project stuff
|
[12:11] Steve-o
|
Ok just enabled ticket spinlocks in trunk, hopefully SunPro on Linux doesn't choke on it
|
[12:12] mikko
|
Steve-o: will test now
|
[12:12] sustrik
|
pieterh_: congrats!
|
[12:13] sustrik
|
do you want access to freshmeat 0mq account?
|
[12:13] pieterh
|
sustrik: I'm now splitting the old procedures page into two, Development and Releases
|
[12:13] pieterh
|
ah, probably best to use that account, right?
|
[12:13] pieterh
|
since it's already configured
|
[12:14] pieterh
|
ack, send me the access details by private IRC
|
[13:21] sustrik
|
pieterh: development subsection under development section in the left pane is strange
|
[13:22] sustrik
|
it's about building 0mq rather than development
|
[13:22] pieterh
|
sustrik: try now?
|
[13:23] pieterh
|
uhm, hang on...
|
[13:24] pieterh
|
Development is not accurate, it's more... can I use "Activity" or something for the nav title?
|
[13:27] sustrik
|
it's build
|
[13:27] sustrik
|
imo
|
[13:28] sustrik
|
or just call it source
|
[13:29] sustrik
|
as before
|
[13:39] sustrik
|
other way round!
|
[13:39] sustrik
|
section Development
|
[13:40] sustrik
|
subsection Source
|
[13:40] sustrik
|
build notes on source page kind of make sense
|
[13:42] pieterh
|
What do we call the main repository?
|
[13:43] pieterh
|
if it's the "development" repository, that's the page title
|
[13:43] pieterh
|
development of 0MQ does not cover bindings, releases, etc. afaics
|
[13:43] sustrik
|
its master
|
[13:44] pieterh
|
master is a branch
|
[13:44] sustrik
|
vanilla?
|
[13:44] pieterh
|
meaningless name is meaningless
|
[13:44] sustrik
|
whatever
|
[13:44] sustrik
|
but the caption is misleading
|
[13:44] pieterh
|
well, I need a term that sticks, this is vital
|
[13:45] pieterh
|
i can fix the caption afterwards
|
[13:45] pieterh
|
upstream?
|
[13:45] pieterh
|
core?
|
[13:45] pieterh
|
mama?
|
[13:45] sustrik
|
The section should be called "Development"
|
[13:45] sustrik
|
so that only people who are interested in development look at the subsections
|
[13:45] pieterh
|
ok... brilliant idea...
|
[13:46] pieterh
|
"0MQ/2.2" as title
|
[13:46] pieterh
|
"0MQ/2.1"
|
[13:46] sustrik
|
and the first thing you want to do when developing is getting the source
|
[13:46] pieterh
|
"0MQ/3.0"
|
[13:46] pieterh
|
yes
|
[13:46] sustrik
|
so the subsection should be called source
|
[13:46] pieterh
|
the 'source repo'?
|
[13:46] pieterh
|
could work
|
[13:46] sustrik
|
up to you
|
[13:46] sustrik
|
source is somewhat shorter
|
[13:46] pieterh
|
god no, I need buyin :-)
|
[13:46] pieterh
|
ok, let's try that
|
[13:50] Guthur
|
is this a new look website you guys are talking about
|
[13:56] pieterh
|
Guthur: which one?
|
[13:57] pieterh
|
sustrik: I'm going to deprecate the part about emailing patches to the list, and start mentioning github pull requests, is that OK?
|
[13:57] Guthur
|
pieterh: whatever you guys are debating
|
[13:58] pieterh
|
Oh, this is about contributions & releases
|
[14:08] sustrik
|
leave the part about ML there
|
[14:08] pieterh
|
yes, I'm making it a collapsible section
|
[14:13] bijuC
|
Hello.. I am facing something strange with 2.1.0 and 2.1.1 MSVC RELEASE builds
|
[14:14] mikko
|
bijuC: whats the problem?
|
[14:14] bijuC
|
The DEBUG build works fine .. I have a PUB app and a SUB app.. i am able to receive everything at SUB
|
[14:15] bijuC
|
but for a RELEASE build .. there are no errors.. but i receive nothing on SUB.. I used Wireshark to check but seems like there is no transmission
|
[14:16] sustrik
|
pieterh: i'm not accepting pull requests, the process on the site would mislead people
|
[14:16] pieterh
|
oh... sorry
|
[14:16] pieterh
|
I misremembered
|
[14:16] sustrik
|
np
|
[14:17] pieterh
|
hmm... I hope over time you'll be convinced, it really is simpler for everyone
|
[14:18] bijuC
|
I am trying both tcp and epgm.. both exhibit the same behaviour
|
[14:18] sustrik
|
i want the process to be as widely usable as possible
|
[14:18] sustrik
|
pull requests depend on:
|
[14:19] sustrik
|
1. contributor having git and understanding it
|
[14:19] sustrik
|
2. github being online
|
[14:19] mikko
|
Steve-o: problem
|
[14:19] Steve-o
|
shoot
|
[14:20] sustrik
|
i also want to have the log of patch discussions in my mailbox
|
[14:20] sustrik
|
rather than hosted in a third-party, non-exportable archive
|
[14:20] mikko
|
Steve-o: the constants defined by the autoconf build don't have the CONFIG_ prefix
|
[14:20] mikko
|
which a lot of openpgm expects
|
[14:21] pieterh
|
sustrik: well, I think 90% of people on the list were for pull requests
|
[14:21] mikko
|
got this on solaris:
|
[14:21] mikko
|
-DHAVE_GETTIMEOFDAY=1 defined in build
|
[14:21] pieterh
|
but each maintainer is free to choose, obviously
|
[14:21] mikko
|
"time.c", line 412: #error: "gettimeofday() or ftime() required to calculate counter offset"
|
[14:21] sustrik
|
sure
|
[14:21] sustrik
|
it's just me
|
[14:21] pieterh
|
I've changed that page, it was important to me to have a documented policy for other projects as well
|
[14:21] Steve-o
|
mikko: all the autoconf defined ones are ignored currently
|
[14:21] mikko
|
the constant that it checks for is CONFIG_HAVE_GETTIMEOFDAY
|
[14:21] mikko
|
Steve-o: so i'll hack that specific case
|
[14:22] Steve-o
|
mikko: it basically sets up the environment to match scons
|
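The kind of one-off hack mikko describes could look roughly like the sketch below: the build defines HAVE_GETTIMEOFDAY, the source checks CONFIG_HAVE_GETTIMEOFDAY, so a small preprocessor bridge maps one name to the other. The placement of such a bridge and the `kHaveGettimeofday` constant are assumptions for illustration; the log does not show what the actual workaround was.

```cpp
// Assumption: the Autoconf build passes -DHAVE_GETTIMEOFDAY=1 on the
// compiler command line. It is defined here only so the sketch stands alone.
#ifndef HAVE_GETTIMEOFDAY
#define HAVE_GETTIMEOFDAY 1
#endif

// Bridge: expose the CONFIG_-prefixed name that the source checks for.
#if defined(HAVE_GETTIMEOFDAY) && !defined(CONFIG_HAVE_GETTIMEOFDAY)
#define CONFIG_HAVE_GETTIMEOFDAY 1
#endif

// Same style of guard as the one that produced the error in the log.
#if !defined(CONFIG_HAVE_GETTIMEOFDAY)
#error "gettimeofday() or ftime() required to calculate counter offset"
#endif

// Hypothetical constant so ordinary code can see the outcome of the bridge.
#ifdef CONFIG_HAVE_GETTIMEOFDAY
constexpr bool kHaveGettimeofday = true;
#else
constexpr bool kHaveGettimeofday = false;
#endif
```

A bridge like this only covers one macro; a general fix would have to apply the same mapping to every feature test the source expects.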
[14:22] pieterh
|
sustrik: your point #1, either way people need to learn git, and formatted patches is harder, not easier
|
[14:22] pieterh
|
so it's just down to not trusting github
|
[14:22] pieterh
|
but if it's offline, nothing works either
|
[14:22] sustrik
|
nope, there are platforms with no git
|
[14:22] pieterh
|
?
|
[14:23] pieterh
|
the documented process depends on github and git
|
[14:23] sustrik
|
openvms, zos?
|
[14:23] pieterh
|
I'm 100% sure openvms has git
|
[14:23] sustrik
|
it has not
|
[14:23] sustrik
|
brett is working on it
|
[14:23] sustrik
|
but it's not yet ported
|
[14:24] pieterh
|
sure, but the process as already documented does not work then
|
[14:24] pieterh
|
so you are saying "because your process B does not work on OpenVMS, I want to use process A"
|
[14:24] mikko
|
Even VMS seems to have it (although if git is ever ported to VMS, I'll
|
[14:24] mikko
|
just have to shoot myself. I used VMS in -88, and the scars are _still_
|
[14:24] mikko
|
fresh).
|
[14:24] mikko
|
Linus
|
[14:24] pieterh
|
while process A doesnt work on OpenVMS either
|
[14:24] pieterh
|
?
|
[14:24] pieterh
|
weird arguments, el sustrik
|
[14:25] sustrik
|
what i am saying is that i am willing to accept simple diffs
|
[14:25] mikko
|
sustrik: github has subversion bridge
|
[14:25] pieterh
|
well, add that
|
[14:25] sustrik
|
if there's good reason for that
|
[14:25] pieterh
|
sure
|
[14:25] sustrik
|
yup
|
[14:25] pieterh
|
if it's not on the page, it's not there
|
[14:25] sustrik
|
mikko: yes, i know
|
[14:25] pieterh
|
what i want is consistency
|
[14:25] pieterh
|
simplicity
|
[14:25] pieterh
|
least surprise
|
[14:25] pieterh
|
that's how we get more contributors
|
[14:26] pieterh
|
pull requests generate more contributions
|
[14:26] pieterh
|
that is my experience and others have reported the same
|
[14:26] pieterh
|
the format patch / email process is a barrier
|
[14:27] pieterh
|
just compare the complexity of the two explanations on that page
|
[14:28] pieterh
|
it requires (and I just counted) 3x more words to describe the email/format patch process
|
[14:29] pieterh
|
mikko: you did not _like_ VMS?
|
[14:29] mikko
|
pieterh: that was linus torvalds
|
[14:30] pieterh
|
ah :-)
|
[14:30] mikko
|
i was googling out of interest what is hindering git on openvms
|
[14:30] sustrik
|
done
|
[14:35] pieterh
|
okaaay... finished documenting all the new process
|
[14:37] nooob
|
what is a good way to integrate zmq with http
|
[14:38] bijuC
|
Any inputs on the MSVC RELEASE issue??
|
[14:39] mikko
|
bijuC: i can give it a spin a bit later
|
[14:39] pieterh
|
nooob: make a bridge that speaks HTTP at one side and 0MQ at the other
|
[14:40] mikko
|
bijuC: just in case, can you show the code?
|
[14:40] pieterh
|
nooob: this is kind of what mongrel2 does
|
[14:40] bijuC
|
ohkk.. later today??
|
[14:40] nooob
|
how good is mongrel2?
|
[14:41] bijuC
|
yeah.. i can show the snippet..lemme know where n how.. but i wonder why release build only gives an issue..
|
[14:42] pieterh
|
mikko: I've tried to summarize the arguments for/against multiple repos here: http://www.zeromq.org/topics:release-process
|
[14:42] pieterh
|
sustrik: whenever you want, I'll spinoff 2.0 into its own git, the process is now documented and clear
|
[14:49] mikko
|
pieterh: "It is easy to cherry-pick changes from the source git to release gits."
|
[14:50] pieterh
|
yes
|
[14:50] mikko
|
interesting that it's only argument for different repos
|
[14:50] pieterh
|
uhm, there are like four arguments...
|
[14:50] mikko
|
"it is easy to cherry-pick changes from the master branch to release branch"
|
[14:51] pieterh
|
you misunderstand
|
[14:51] pieterh
|
"separate gits are extra work"
|
[14:51] pieterh
|
"no, it's easy"
|
[14:51] mikko
|
ah ok
|
[14:51] pieterh
|
I'll clarify
|
[14:51] mikko
|
i guess what ever you find easiest
|
[14:52] pieterh
|
we votes with our feets
|
[14:52] pieterh
|
honestly, making this 2.1.1 release was utterly easy
|
[14:52] pieterh
|
I'd like to compress our release cycles significantly, it's been my complaint for a year or so
|
[14:52] pieterh
|
has taken waay too long to get code into peoples' hands
|
[14:58] bijuC
|
Sorry got disconnected
|
[15:03] bijuC
|
I am also looking for clarity as to how openPGM is linked into 2.1.1 for MSVC
|
[15:03] bijuC
|
I see #include <pgm/pgm.h> in pgm_socket
|
[15:04] bijuC
|
But the libzmq does not include the openPGm dir nor does it link against libpgm.lib
|
[15:04] bijuC
|
Just curious how it is done
|
[15:08] mikko
|
bijuC: i use separate installer
|
[15:08] mikko
|
http://snapshot.zero.mq/msvc2008/
|
[15:08] mikko
|
you can try the snapshot dlls
|
[15:08] mikko
|
they should be built with openpgm
|
[15:11] bijuC
|
Sorry Mikko, do you mean separate installer for zmq or openPGM
|
[15:11] bijuC
|
You want me to try the dll/libs in the link you gave .. to see if the RELEASE build issue persists?
|
[15:12] mikko
|
bijuC: im saying that this link contains dll built with openpgm
|
[15:12] mikko
|
statically linked openpgm
|
[15:12] mikko
|
give it a spin whether the issue persists with that version as well
|
[15:12] bijuC
|
oh ok
|
[15:12] bijuC
|
give me 5 mins
|
[15:12] mikko
|
thats the latest master branch
|
[15:13] mikko
|
can you also put the code to gist.github.com
|
[15:14] cremes
|
glad to hear the christchurch guy came through the earthquake relatively unscathed
|
[15:37] andrewvc
|
cremes: good morning
|
[15:38] cremes
|
andrewvc: good morning
|
[15:38] cremes
|
i saw your notes; that is an incredible find
|
[15:38] andrewvc
|
yeah, I mean, I knew it had to be something simple and stupid lol
|
[15:38] cremes
|
i was looking over my transcript from last night and i said (paraphrased)
|
[15:38] cremes
|
"there's no way 0mq is exposing an internal socketpair"
|
[15:38] cremes
|
:)
|
[15:38] andrewvc
|
lol
|
[15:39] cremes
|
any ideas on how to fix 0mq?
|
[15:39] andrewvc
|
yeah
|
[15:39] andrewvc
|
well, give EM a 'nevershutdownthisFD' flag
|
[15:39] andrewvc
|
is one
|
[15:39] cremes
|
that just masks the problem
|
[15:39] andrewvc
|
I don't think its possible
|
[15:40] andrewvc
|
aside from writing a kernel patch
|
[15:40] cremes
|
i think this deserves a 0mq fix so that *no* language binding can screw it up
|
[15:40] andrewvc
|
ZMQ wants a fake FD, but those don't exist
|
[15:40] cremes
|
well, perhaps a flag in the 0mq code that uses that FD
|
[15:40] cremes
|
it may have to check that it's still valid every time before using it
|
[15:40] andrewvc
|
yeah, I mean
|
[15:41] cremes
|
if 0mq is going to expose that *and* external code can close this "internal" socketpair, then it has to be defensive
|
[15:41] andrewvc
|
you could do that, but it'd still suck, because then you'd have to either open a new FD automatically
|
[15:41] andrewvc
|
which could cause tons of its own issues
|
[15:41] andrewvc
|
or just let the socket die, but not be exceptional
|
[15:41] andrewvc
|
I guess you could make a new error code eh?
|
[15:41] cremes
|
doesn't ETERM capture the issue well enough?
|
[15:43] andrewvc
|
yeah, you could probably trigger it here: https://github.com/zeromq/zeromq2/blob/master/src/mailbox.cpp#L204
|
[15:43] andrewvc
|
add a condition for EBADF
|
[15:43] cremes
|
yeah, that has potential
|
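A rough sketch of the "add a condition for EBADF" idea: when the read on the internal signalling descriptor fails with EBADF, report a distinct "terminated" outcome instead of treating it as an unexpected failure. This is not the code in mailbox.cpp; it uses plain POSIX `read` on an arbitrary descriptor, and `MailboxRead` and `read_signal` are hypothetical names. How such an outcome would map to ETERM in 0MQ is left open here.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical result type for one attempt to read a signalling byte.
enum class MailboxRead { Ok, Interrupted, Terminated, Failed };

// Read one byte from fd and classify the outcome.
// EBADF is handled explicitly: it suggests that someone outside the
// library closed the descriptor.
inline MailboxRead read_signal(int fd, unsigned char *out)
{
    const ssize_t n = ::read(fd, out, 1);
    if (n == 1)
        return MailboxRead::Ok;
    if (n == -1 && errno == EINTR)
        return MailboxRead::Interrupted;   // caller may retry
    if (n == -1 && errno == EBADF)
        return MailboxRead::Terminated;    // descriptor closed behind our back
    return MailboxRead::Failed;            // EOF or any other error
}
```

The caller can then turn `Terminated` into a clean error for the application, which is closer to the "friendly error message" asked for than an assertion failure.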
[15:44] ptrb
|
pyzmq requires python... what? 2.5?
|
[15:44] andrewvc
|
I mean, it's definitely something someone will hit in the future
|
[15:45] andrewvc
|
if you stick ZMQ_FD into an existing poller, there's a good chance you automatically will call close on it
|
[15:45] andrewvc
|
or whatever
|
[15:46] andrewvc
|
so, my other thought, is just to add this to EM instead of using Kernel#epoll EM::PeriodicTimer.new(0.1) { #zmq_poll stuff}
|
[15:47] andrewvc
|
which I'm not sure would really have any perf ramifications, I mean, 2 calls to a poller vs. 1 prolly wouldn't kill performance right?
|
[15:47] andrewvc
|
though it isn't great
|
[15:55] cremes
|
andrewvc: i don't think using a periodic timer is appropriate for this
|
[15:55] cremes
|
i think it needs a genuine fix in 0mq
|
[15:55] andrewvc
|
agreed
|
[15:56] andrewvc
|
I guess I'll post it on the list
|
[15:56] cremes
|
still, an awesome find; i see from the irc backbuffer that sustrik already knows about it
|
[15:56] andrewvc
|
hehe, yeah. This would have been really easy to find btw, if I'd known about a trick evanphx showed me
|
[15:56] andrewvc
|
well, didn't show me, we never used it
|
[15:56] andrewvc
|
but told me about
|
[15:56] cremes
|
??
|
[15:57] andrewvc
|
apparently dtrace can be set to give a GDB back trace for each syscall
|
[15:57] andrewvc
|
I'm not sure if strace does that, but it would have showed who was calling close()
|
[15:57] cremes
|
is that an osx-only trick?
|
[15:57] andrewvc
|
perhaps, I didn't really look into it
|
[15:57] cremes
|
that's very cool; i'll have to research more details
|
[15:58] andrewvc
|
yeah, definitely
|
[16:14] andrewvc
|
cremes: so the other aspect of it is, there's likely a bug in my code
|
[16:14] andrewvc
|
but it's not my fault :) somehow I'm triggering an exception, and EM is catching and dealing with it
|
[16:14] andrewvc
|
but I still call a getsockopt
|
[16:15] andrewvc
|
which is odd because I never use threads
|
[16:15] andrewvc
|
since I'm in EM
|
[16:15] andrewvc
|
could be something in the internals of EM I guess
|
[16:15] cremes
|
try wrapping your code in begin/rescue/end and see if you can catch it before EM does
|
[16:15] sustrik
|
andrewvc: the problem with ruby binding
|
[16:15] sustrik
|
we've spoken about
|
[16:16] sustrik
|
is it fixed?
|
[16:16] andrewvc
|
hehe, hey I just woke up :)
|
[16:16] sustrik
|
just asking
|
[16:16] andrewvc
|
I'll take a look at it later for sure though
|
[16:16] andrewvc
|
yeah, no worries :)
|
[16:16] cremes
|
isn't this a problem that *all* language bindings have?
|
[16:17] sustrik
|
no idea
|
[16:17] cremes
|
andrewvc: ??
|
[16:17] sustrik
|
it's ruby-ffi, right?
|
[16:18] andrewvc
|
lemme look at it again, I was so tired last night, I don't recall what it was
|
[16:18] cremes
|
if the binding can close the FD of the internal socketpair and 0mq doesn't catch it, that's a general problem
|
[16:18] andrewvc
|
ohhh, we're talking about two separate issues
|
[16:18] sustrik
|
ah
|
[16:18] cremes
|
ok, ignore me then :)
|
[16:18] andrewvc
|
so, one issue is this
|
[16:18] cremes
|
andrewvc: i'll let you do the talking
|
[16:18] sustrik
|
i meant the BADFD one
|
[16:18] andrewvc
|
oh, so it is the same one
|
[16:18] andrewvc
|
well, it's not actually the binding
|
[16:19] andrewvc
|
it's eventmachine, the ruby reactor library
|
[16:19] andrewvc
|
it assumes on hitting an exceptional state that you want to close all the FDs
|
[16:19] sustrik
|
aha
|
[16:19] andrewvc
|
which is its fault
|
[16:19] andrewvc
|
but maybe zmq should check for EBADF
|
[16:19] andrewvc
|
in mailbox.cpp
|
[16:19] andrewvc
|
and print a friendly error message
|
[16:19] sustrik
|
there's one issue in 0mq core bug tracker
|
[16:19] andrewvc
|
like hey, you shutdown an internal FD I think, don't do that
|
[16:19] sustrik
|
https://github.com/zeromq/zeromq2/issues#issue/166
|
[16:19] sustrik
|
which looks similar
|
[16:20] sustrik
|
can it be possibly related?
|
[16:21] andrewvc
|
one sec, brb
|
[16:24] andrewvc
|
sustrik: cremes: I don't think that's related
|
[16:25] andrewvc
|
I don't see him using ZMQ::FD
|
[16:26] andrewvc
|
I'm about to post to the ML, with a better breakdown of what happened / better organized thoughts
|
[16:41] cremes
|
sustrik: what do you think of: https://github.com/zeromq/zeromq2/issues#issue/171
|
[17:11] jfkw
|
Newb to message-queueing, interested in 0mq, wondered if the following is in scope:
|
[17:12] jfkw
|
I have a remote-site BerkeleyDB with a high write rate. Need to replicate back to the home office over a not-fast, sometimes unavailable link.
|
[17:14] jfkw
|
Does 0mq support reliable in-order delivery and compression of data over the wire? Remote site runs Windows XP, home office will be Linux 2.6+.
|
[17:16] andrewvc
|
jfkw compression, no
|
[17:17] andrewvc
|
in-order yes
|
[17:17] andrewvc
|
as far as unavailable, it should reconnect automatically if using the TCP transport
|
[17:17] andrewvc
|
but I'd test it on your network to be sure
|
[17:18] andrewvc
|
you can compress the data yourself before sending it
|
[17:18] andrewvc
|
since zmq frames messages, each one could be gzipped
|
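A minimal sketch of the per-message compression andrewvc describes, using only Python's standard `gzip` module. The helper names `pack` and `unpack` and the sample record are made up for illustration; in a real program the compressed bytes would be handed to whatever send call your 0MQ binding provides.

```python
import gzip

def pack(payload: bytes) -> bytes:
    """Compress one message body before handing it to the socket."""
    return gzip.compress(payload)

def unpack(frame: bytes) -> bytes:
    """Reverse of pack(): run on each message body after receiving it."""
    return gzip.decompress(frame)

# Hypothetical record; repetitive data compresses well, tiny messages may grow.
record = b"key=42;value=" + b"x" * 1000
frame = pack(record)
restored = unpack(frame)
print(len(record), "->", len(frame), "bytes; round trip ok:", restored == record)
```

Because each message is compressed on its own, the receiver can decompress every message independently, which fits the fact that 0MQ delivers whole messages rather than a byte stream.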
[17:19] andrewvc
|
as far as DB replication goes though, not sure I'd just trust all that to ZMQ
|
[17:22] pieterh
|
jfkw: have you read the Guide yet?
|
[17:26] jfkw
|
andrewvc: ack on BDB replication being nontrivial, won't expect the ZMQ to do it for us.
|
[17:26] andrewvc
|
cool, yeah I'd really read the guide
|
[17:27] jfkw
|
pieterh: Ah, thanks for suggesting that, I see the link on the first page of Learn now. Missed it the first time through hunting for my specific issue.
|
[17:27] pieterh
|
jfkw: let me make that link more visible
|
[19:00] amacleod
|
I feel like I must somehow be misusing the Java API. I keep getting assertions when I try to terminate contexts.
|
[19:00] amacleod
|
Now it is "nbytes == sizeof (command_t)" in mailbox.cpp:244
|
[19:01] amacleod
|
Using git master pulled yesterday.
|
[19:12] pieterh
|
it shouldn't be possible to crash 0MQ via an API like that... :-/
|
[19:12] amacleod
|
I agree :-/
|
[19:13] pieterh
|
do you have a simple reproducible case?
|
[19:13] amacleod
|
So, what I'm doing now is running, in the same run-time due to the way my test harnesses work, two separate servers, one after the other.
|
[19:13] amacleod
|
pieterh, I do not presently. I might be able to make one.
|
[19:13] pieterh
|
we'd need that
|
[19:13] pieterh
|
an absolute minimal test case
|
[19:14] amacleod
|
Is it possible for lingering data from a previous socket to corrupt a new one that binds to the same port?
|
[19:14] pieterh
|
sorry, this is not fun when you're trying to do other work...
|
[19:14] pieterh
|
amacleod: not if the socket is being kept within one thread
|
[19:14] pieterh
|
are you moving sockets between threads?
|
[19:14] pieterh
|
that is one common cause of 0MQ failures and it obviously can happen in any threaded language
|
[19:14] amacleod
|
pieterh, indeed. Let me see what I can do about creating a minimal test case.
|
[19:15] amacleod
|
pieterh, I'm pretty sure I've gotten it to where I'm never moving sockets between threads.
|
[19:16] pieterh
|
you used to do that?
|
[19:16] amacleod
|
Well, that was the cause of my original problem. I didn't think I was using sockets between threads, but all threads were using a single context to create sockets.
|
[19:18] pieterh
|
that should be safe
|
[19:18] pieterh
|
contexts can be shared
|
[19:18] pieterh
|
if threads create sockets and don't pass those to other threads, you're not moving sockets between threads
|
[19:19] pieterh
|
even that is safe so long as you do not read/write/close from more than one thread
|
[19:23] amacleod
|
On second look, I think threading may still be my problem. All of the actual reading and writing happens in worker threads, which is fine, but then when the test is complete, it calls the close method of my wrapper object, so close would be called from the main thread. :(
|
[19:39] pieterh
|
amacleod: yeah, that would crash it
|
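One way to avoid the cross-thread close amacleod describes is to have the main thread ask the worker to shut down, so the worker closes the socket it created. The sketch below uses only the standard library; `FakeSocket` is a made-up stand-in for a real 0MQ socket that raises if it is touched from any thread other than its creator.

```python
import queue
import threading

class FakeSocket:
    """Stand-in for a 0MQ socket; refuses use from a foreign thread."""
    def __init__(self):
        self.owner = threading.get_ident()
        self.closed = False

    def _check(self):
        if threading.get_ident() != self.owner:
            raise RuntimeError("socket used from a foreign thread")

    def send(self, msg):
        self._check()

    def close(self):
        self._check()
        self.closed = True

def worker(commands, results):
    sock = FakeSocket()          # created inside the worker thread
    while True:
        cmd = commands.get()
        if cmd is None:          # shutdown request from the main thread
            break
        sock.send(cmd)
    sock.close()                 # closed by the thread that created it
    results.put(sock.closed)

commands, results = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(commands, results))
t.start()
commands.put(b"work")
commands.put(None)               # ask the worker to close its own socket
t.join()
closed_by_owner = results.get()
print("closed by owning thread:", closed_by_owner)
```

The wrapper object's close method then only needs to post the shutdown request and join the thread; it never touches the socket itself.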
[19:42] amacleod
|
Threaded concurrency gives me a headache. Give me a single-threaded select/poll event reactor any day.
|
[19:45] cremes
|
amacleod: zmqmachine, but you gotta use ruby ;)
|
[19:45] amacleod
|
cremes, :-D Wish I could.
|
[19:45] amacleod
|
It would be cool if Twisted had 0MQ support too.
|
[19:45] pieterh
|
amacleod: sounds like you're mixing shared-state concurrency with message passing
|
[19:45] pieterh
|
that is usually a terrible idea
|
[19:46] pieterh
|
I've pushed a whitepaper for an open source data plant: http://www.zeromq.org/whitepapers:open-source-data-plant
|
[19:46] amacleod
|
pieterh, I know it is. Maybe I can turn message passing to my advantage.
|
[19:47] pieterh
|
cremes: this might interest you...
|
[19:48] pieterh
|
amacleod: what transport are you using to/from workers?
|
[19:48] pieterh
|
tcp?
|
[19:48] amacleod
|
Well.. I'm using TCP for the client/server communication.
|
[19:48] cremes
|
pieterh: what might interest me?
|
[19:49] pieterh
|
see the line just above
|
[19:49] cremes
|
oh, the data plant thing? i'll look at it...
|
[19:49] amacleod
|
What I'm just now thinking is that maybe I should have the stuff inside the server do like the pattern from your recent example--inproc router to router (or dealer to dealer?).
|
[19:49] pieterh
|
cremes: great
|
[19:50] pieterh
|
amacleod: what example, there are so many...?
|
[19:50] pieterh
|
use inproc when you know the server will always be one process
|
[19:50] pieterh
|
use tcp when you want to in future move workers to other boxes or processes
|
[19:51] amacleod
|
http://zguide.zeromq.org/chapter:all#toc51
|
[19:51] amacleod
|
Part of the trouble is I need to federate disparate connection types.
|
[19:51] pieterh
|
right, so clients and servers talk over tcp, each has their own context, and manage their own sockets
|
[19:52] pieterh
|
that sounds painful!
|
[19:52] amacleod
|
Yeah. Square pegs, round holes. Pain pain pain.
|
[20:01] cremes
|
pieterh: i built a (small) whaleshark and didn't even know it!
|
[20:01] cremes
|
:)
|
[20:02] pieterh
|
cremes: :-)
|
[20:03] cremes
|
pieterh: i'm wondering if i can help you with a C repro of this: https://github.com/zeromq/zeromq2/issues#issue/171
|
[20:03] cremes
|
it's preventing my whaleshark from running for more than about 6 hours before exhausting memory
|
[20:03] cremes
|
:\
|
[20:04] pieterh
|
when you say, "if I can help you", you mean "if you can help me"?
|
[20:05] cremes
|
um, yeah!
|
[20:05] cremes
|
and by extension, the whole universe of 0mq users!
|
[20:05] pieterh
|
of course, of course...
|
[20:05] cremes
|
heh
|
[20:05] pieterh
|
i'm mentally filtering out all that reactor stuff...
|
[20:06] CIA-21
|
jzmq: Alois Bělaška master * r013828f / (7 files in 3 dirs): Merge remote branch 'orig/master' - http://bit.ly/eaOuPN
|
[20:06] CIA-21
|
jzmq: Alois Bělaška master * r086899d / (2 files): Poll timeout in ZMQQueue and ZMQForwarder changed from 250 usec to 250 msec. - http://bit.ly/dP4yAQ
|
[20:06] CIA-21
|
jzmq: Alois Bělaška master * r522d747 / (2 files): Fixed termination of ZMQForwarder and ZMQQueue tests. - http://bit.ly/dOWVmj
|
[20:06] cremes
|
pieterh: the 3 steps i list are pretty much the meat of the issue
|
[20:06] pieterh
|
ok, tomorrow, I want to catch a movie this evening
|
[20:06] cremes
|
i repro'ed it using my reactor since all the threading stuff is abstracted away for me
|
[20:06] cremes
|
pieterh: no worries...
|
[20:06] mikko
|
cremes: are you sure that this is not ruby gc issue?
|
[20:06] pieterh
|
I'll make the case you explained, we'll see if it reproduces the problem or not
|
[20:08] dijix
|
I've got something that should be simple giving me problems - something I usually take to mean I have a fundamental misunderstanding.
|
[20:08] mikko
|
maybe not
|
[20:09] dijix
|
I can set up a REQ/REP - REQ connects to bound REP. I can send from REQ, receive at REP and send back to REQ - no problems.
|
[20:10] dijix
|
Changing to REQ/XREP - I get a three-part message at XREP from the same simple Send from REQ - but I cannot seem to send anything at all back to the REQ. even if on that side, I RecvAll, I cannot seem to get anything.
|
[20:11] cremes
|
mikko: i am nearly positive; this behavior is the same across 3 different ruby runtimes each with completely different GC implementations
|
[20:12] cremes
|
also, the ruby heap isn't growing; top will show the rsize as (for example) 1GB but dumping the ruby heap
|
[20:12] cremes
|
only produces around 100MB; where is the other 900MB?
|
[20:13] cremes
|
but the only way to know for sure is to repro in C
|
[20:14] cremes
|
dijix: read chapter 3 (advanced request-reply) in the guide
|
[20:15] cremes
|
dijix: it will explain how to deal with the xrep envelope and how to properly return a reply with it
|
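A rough sketch of the envelope handling cremes is pointing at, with a message modelled as a plain Python list of byte strings rather than a real socket call. It assumes the three-part layout dijix reports (address, empty delimiter, payload); the function names and the sample identity are invented for illustration.

```python
def split_envelope(parts):
    """Split [address..., b"", body...] at the first empty delimiter part."""
    i = parts.index(b"")
    return parts[:i], parts[i + 1:]

def build_reply(addresses, body):
    """Put the untouched address parts and the delimiter back in front of the reply."""
    return list(addresses) + [b""] + list(body)

# What an XREP socket might hand over for a single REQ send (made-up identity).
request = [b"\x00" + bytes(range(16)), b"", b"hello"]
addresses, body = split_envelope(request)
reply = build_reply(addresses, [b"world"])
print(reply[0] == request[0], reply[1:], body)
```

The address parts are treated as opaque bytes and copied back unchanged; only the payload after the empty delimiter is interpreted.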
[20:16] mikko
|
cremes: forwarder device is the one that is problematic?
|
[20:20] cremes
|
mikko: no, i don't think it's specific to the device
|
[20:20] cremes
|
but using one is the easiest way to isolate the problem and see the memory growth
|
[20:20] dijix
|
cremes: Thanks - I was trying to figure it out from info in Chap 2.
|
[20:21] cremes
|
it appears to me that some resource isn't freed when there are rapid connect/disconnect operations going on
|
[20:21] cremes
|
dijix: you are welcome; once you figure the envelope/framing out, it's actually pretty cool
|
[20:22] pieterh
|
cremes: I'll do that tomorrow, np.
|
[20:23] pieterh
|
cyat, /me is off
|
[20:23] mikko
|
pieterh: enjoy
|
[20:23] pieterh
|
mikko: thanks :-)
|
[20:23] cremes
|
enjoy your movie!
|
[20:56] ljackson
|
anyone know if many threads using push socket each with HWM and SWAP is a valid use case ?
|
[20:57] ljackson
|
appears that (swap) it is not thread safe ....
|
[20:57] ljackson
|
2.1.1
|
[20:58] mikko
|
ljackson: you mean threads concurrently accessing push socket?
|
[20:59] ljackson
|
no one push socket per thread
|
[20:59] ljackson
|
and having HWM and SWAP enabled on them
|
[21:00] ljackson
|
appears to have a segfault as the context is sharing swap for the threads without locking i think
|
[21:00] mikko
|
i thought there was atomic counter for swap instances
|
[21:01] mikko
|
each socket having their own
|
[21:01] mikko
|
have you got a small reproducible test-case?
|
[21:01] ljackson
|
http://pastebin.com/JV1PPyt0
|
[21:01] ljackson
|
no
|
[21:01] ljackson
|
just ran across it and wanted to know if it was valid to have swap/hwm per socket in threads w/ single contet
|
[21:01] ljackson
|
er context
|
[21:01] ljackson
|
for PUSH
|
[21:03] ljackson
|
there were 4 threads each with their own PUSH socket to the same tcp address all with HWM of 10k and SWAP of 25Mb
|
[21:03] ljackson
|
and one of them caused that segfault
|
[21:04] ljackson
|
yep looks like they create 4 swap files
|
[21:04] ljackson
|
so odd that it segfaulted
|
[21:05] mikko
|
i mean the actual code
|
[21:06] ljackson
|
yeah If I can figure out how to reproduce I will create a test bit of code
|
[21:06] mikko
|
cool
|
[21:06] ljackson
|
just wanted to know if the use case was valid; now that I see the 4 swap files, it must be something else
|
[21:06] mikko
|
can you strace the process ?
|
[21:06] mikko
|
to see what kind of swap files are being created
|
[21:06] ljackson
|
i see them in the directory
|
[21:06] mikko
|
it should be valid case
|
[21:06] ljackson
|
as .swap
|
[21:07] ljackson
|
with zmq_<master pid>_<thread_id>.swap
|
[21:07] ljackson
|
er s/thread_id/socket instance/
|
[21:07] ljackson
|
is there a socketopt to specify where the swap file would be ?
|
[21:08] mikko
|
no, not at the moment
|
[21:09] mikko
|
you can for example chdir("/tmp"); at the beginning
|
[21:11] ljackson
|
ok fair nuf
|
[21:11] ljackson
|
what about mmap'ed vs file
|
[21:13] ljackson
|
it appears to pause the HWM/SWAP'ed socket's send method after it reaches the max, this must be dependent on the socket type ? e.g. pub would discard whereas req/rep & push/pull ...etc won't ?
|
[21:14] mikko
|
yes, the hwm behaviour is socket dependent
|
[21:14] mikko
|
we have had discussions about swap
|
[21:15] mikko
|
i think we came to a conclusion that speed at that point is not the primary concern
|
[21:16] mikko
|
as initially you buffer into memory, then into disc as secondary thing
|
[21:18] dijix
|
cremes: I've got the XREP message envelopes figured out - thanks for your help
|
[21:19] cremes
|
dijix: glad to hear it!
|
[21:25] dijix
|
cremes: Is there any other difference between using REQ/XREP versus PAIR/XREP, other than the empty message portion between the payload and the address envelopes?
|
[21:26] cremes
|
PAIR is for point-to-point communications; it should *only* be used with another PAIR socket
|
[21:26] cremes
|
therefore PAIR/XREP is illegal
|
[21:26] cremes
|
though the lib won't complain
|
[21:27] dijix
|
true - and it seems to behave
|
[21:51] dijix
|
cremes: Strange - for REQ/XREP - if I set the Identity explicitly on the REQ, the response works, but if I allow it to generate a UUID, it doesn't. on the XREP side, I'm breaking down the RecvAll into three parts, the UUID, empty and payload. Then on the response, I'm constructing a three-part reply using that UUID, an empty message, and the response message.
|
[22:47] dijix
|
cremes: Did you see that comment from me just before the netsplit? regarding explicitly assigning the IDENTITY on the REQ socket before sending to XREP? If I set it, the return message I construct using it works, but not if I use the UUID created automatically
|
[22:48] dg
|
Hello --- can someone confirm that zmq tcp sockets will interoperate with ordinary tcp sockets?
|
[22:48] cremes
|
dijix: nope, didn't see the comment
|
[22:48] cremes
|
dg_: no, they won't; 0mq sockets are built on top of tcp sockets but use their own framing protocol
|
[22:49] dg
|
Well, pants.
|
[22:49] cremes
|
your tcp socket at the other end would have to understand this framing protocol (effectively duplicating 0mq logic)
|
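To give a feel for what "understanding the framing protocol" would involve, here is a sketch of the frame layout as I understand it from the ZMTP/1.0 description used by the 2.x series: a length field that counts the flags octet plus the body, then one flags octet (bit 0 meaning "more parts follow"), then the body. Treat the layout as an assumption to check against the spec; the function names are invented, and connection setup is ignored entirely.

```python
import struct

MORE = 0x01  # flags bit 0: another part of the same message follows

def encode_frame(body: bytes, more: bool = False) -> bytes:
    flags = MORE if more else 0
    length = len(body) + 1                  # length counts the flags octet
    if length < 255:
        header = bytes([length])            # short form: one length octet
    else:
        header = b"\xff" + struct.pack(">Q", length)  # long form: 0xFF + 8 octets
    return header + bytes([flags]) + body

def decode_frame(data: bytes):
    """Return (body, more, bytes_consumed) for the first frame in data."""
    if data[0] == 0xFF:
        (length,) = struct.unpack(">Q", data[1:9])
        pos = 9
    else:
        length, pos = data[0], 1
    flags = data[pos]
    body = data[pos + 1 : pos + length]
    return body, bool(flags & MORE), pos + length

wire = encode_frame(b"hello", more=True) + encode_frame(b"world")
first, more, used = decode_frame(wire)
second, last_more, _ = decode_frame(wire[used:])
print(first, more, second, last_more)
```

A plain TCP client such as telnet or netcat sends none of this, which is why it cannot talk to a 0MQ socket directly.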
[22:49] cremes
|
dg_: take a look at mongrel2
|
[22:49] cremes
|
it can act as a proxy between 0mq and http; perhaps it does the same for tcp
|
[22:50] cremes
|
dijix: that doesn't make any sense
|
[22:50] cremes
|
you should gist/pastie your example code that illustrates your problem
|
[22:50] dg
|
TBH I think I can work with zmq anyway --- but does anyone know a telnet/netcat equivalent I can use to connect to my service so I can send it test data?
|
[22:51] cremes
|
dg_: i haven't heard of one; maybe that's a good first project to get your feet wet with 0mq
|
[22:52] dijix
|
cremes: I know. Okay - REQ/XREP. REQ sets IDENTITY to a value and just sends a simple text message. XREP receives three-part message with the designated IDENTITY value in the first part.
|
[22:52] cremes
|
good so far
|
[22:52] dg
|
Yay. Sigh.
|
[22:52] dg
|
Well, I need a client to talk to my server anyway, so I might as well do that bit now. Thanks anyway.
|
[22:52] cremes
|
dg_: try asking on the mailing list if such a tool exists; it might and i just don't know about it
|
[22:52] dijix
|
cremes: Then XREP constructs new message using that value for the first part, empty, and a response. It is successfully received by REQ.
|
[22:53] cremes
|
ok
|
[22:53] cremes
|
dijix: take a look here: http://api.zero.mq/master:zmq-setsockopt
|
[22:54] dijix
|
cremes: However, if I do not explicitly set the IDENTITY, I still receive a three-parter on XREP, with the auto-gen UUID for the first part. If I follow everything else and use that UUID for the first part on the response message, REQ never receives it.
|
[22:54] cremes
|
note that the system-set IDENTITY starts with a null byte; perhaps you aren't copying the whole IDENTITY?
|
[22:55] dijix
|
cremes: Could be - I'll mess with the Encoding
|
[22:55] cremes
|
dijix: show the code; just put it up in a pastie; maybe i'll see something you are overlooking
|
[22:57] dg
|
Does anyone know about the Lua bindings?
|
[23:00] dg
|
Um, is zmq 2.0.6 really ancient?
|
[23:00] cremes
|
dg_: it's pretty old; use the just-released 2.1.1
|
[23:00] dg
|
Unfortunately that's what Ubuntu's got.
|
[23:01] cremes
|
well, few people are going to want to help a newcomer work through the old and mostly fixed bugs from 2.0.6 :P
|
[23:01] cremes
|
so i recommend you dload the source and compile it yourself
|
[23:07] dg
|
Strewth. Even Debian sid's version is only 2.0.10.
|
[23:20] dg
|
What does the thread pool argument to zmq_init() actually mean? How do I know what to set it to?
|
[23:21] dg
|
The examples all seem to use 1; is this safe for a single-threaded program?
|
[23:23] cremes
|
dg_: it's a dedicated i/o thread for reading/writing to the network
|
[23:23] cremes
|
it's usually set to 1 because unless you are pumping out data onto 10Gb ethernet it is unlikely you will saturate one core/cpu with i/o
|
[23:24] dg
|
Right. Just checking it wasn't doing something evil like using blocking I/O with one thread per concurrent operation.
|
[23:28] dijix
|
cremes: Got it - I was converting everything to Encoding.ASCII, and then also using that encoding for constructing the byte arrays for SendMore.. I just have to grab the first portion as a byte array and not touch it. It worked before because what I was setting the IDENTITY to didn't have anything that wasn't ASCII.
|
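The effect dijix ran into can be reproduced in a few lines. This is a Python analogue rather than the .NET code from the conversation, and the identity value is made up: any byte outside the ASCII range is replaced during decoding, so the bytes that come back out no longer match the address the XREP socket needs.

```python
# Made-up auto-generated identity: a leading null byte followed by arbitrary bytes.
identity = b"\x00" + bytes([0x9F, 0x41, 0xE2, 0x10])

# Lossy path: bytes -> ASCII text -> bytes, replacing anything that does not fit.
as_text = identity.decode("ascii", errors="replace")
round_tripped = as_text.encode("ascii", errors="replace")
survives = round_tripped == identity

# An identity that happens to be pure ASCII survives the same trip,
# which is why the explicitly set IDENTITY appeared to work.
ascii_identity = b"client-1"
ascii_survives = ascii_identity.decode("ascii").encode("ascii") == ascii_identity

print("binary identity survives:", survives)
print("ascii identity survives:", ascii_survives)
```

Keeping the address part as raw bytes from receive to send, as dijix concludes, sidesteps the problem entirely.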
[23:28] cremes
|
dijix: ah ha
|
[23:28] cremes
|
or is that "aha"?
|
[23:30] dijix
|
cremes: I think it's time to leave the office, carrying this day's 0mq successes proudly :)
|
[23:30] cremes
|
dijix: huzzah!
|
[23:30] amacleod
|
When you connect dealer to dealer (XREQ to XREQ) is there any way for the XREQ that used "bind" to send a reply?
|
[23:30] cremes
|
amacleod: why would you connect XREQ to XREQ?
|
[23:30] dijix
|
cremes: Though why .NET won't take bytes convert them to a string, and then convert that back to bytes using the same encoding, and end up with the original byte array is a question for another day.. :/
|
[23:31] amacleod
|
http://zguide.zeromq.org/chapter:all#toc51
|
[23:31] cremes
|
amacleod: that shows two XREQ/DEALER clients talking to a single XREP/ROUTER server
|
[23:31] cremes
|
they aren't talking to each other directly
|
[23:32] amacleod
|
cremes, further down it shows inproc XREQ to XREQ.
|
[23:32] cremes
|
amacleod: i don't see it
|
[23:32] amacleod
|
Figure 47?
|
[23:33] amacleod
|
"Detail of async server".
|
[23:33] cremes
|
huh... i don't know then
|
[23:34] cremes
|
i've never seen that... you'll have to ask pieterh tomorrow or post a question to the ML
|
[23:37] dg
|
Woot. I'm now bouncing a message off my server and getting a reply.
|
[23:37] dg
|
Thanks!
|
[23:38] dg
|
One additional question: if I'm sending a multipart message with ZMQ_SNDMORE, is every part copied and stored before being sent, or are they sent out as I produce them?
|
[23:38] dg
|
e.g. my app's messages are actually very long lists of small strings. Like 5000. Can I send each one as a single part or should I buffer them up into a single block?
|
[23:51] cremes
|
dg_: 0mq is a message queue; just send 'em and it will do the buffering for you
|
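For comparison, the other option dg mentions (buffering the strings into a single block) could look roughly like the sketch below. The length-prefixed layout and the names `pack_strings` and `unpack_strings` are assumptions for illustration, not anything 0MQ defines; the resulting bytes would be sent as one ordinary message.

```python
import struct

def pack_strings(items):
    """Join many small strings into one block: count, then (length, bytes) pairs."""
    out = [struct.pack(">I", len(items))]
    for s in items:
        raw = s.encode("utf-8")
        out.append(struct.pack(">I", len(raw)))
        out.append(raw)
    return b"".join(out)

def unpack_strings(block):
    """Inverse of pack_strings()."""
    (count,) = struct.unpack_from(">I", block, 0)
    pos = 4
    items = []
    for _ in range(count):
        (n,) = struct.unpack_from(">I", block, pos)
        pos += 4
        items.append(block[pos:pos + n].decode("utf-8"))
        pos += n
    return items

strings = ["item-%d" % i for i in range(5000)]
block = pack_strings(strings)
unpacked = unpack_strings(block)
print(len(strings), "strings in one block of", len(block), "bytes")
```

Whether this is worth the extra code compared with sending each string as its own part is something to measure for your workload.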