ZeroMQ IRC Log

Friday July 29, 2011

[Time] Name	Message
[02:14] stelcheck	has anyone worked with ZeroMQ to do a Push/Pull pipeline structure?
[02:17] stelcheck	i am wondering if there is a way to set the socket on the pushing end to skip to the next server if one of the connection is broken instead of stacking it on the dead queue
[02:21] stelcheck	in other words, I am looking to make the queue unique between multiple connections, instead of having one queue per connect, so that whatever is in the queue will be distributed to the existing connections
[02:36] stelcheck	ah, nm
[02:37] stelcheck	i was inverting client and server position
[02:37] stelcheck	so the queuing was inverted as well
[03:04] stelcheck	hmmm, ok, maybe another question then... how can you recover from a server crash in a rep/req request? I.E., if the server does not send a response to the client, the client stops sending its requests because it did not receive a response
[06:04] guido_g	stelcheck: close the socket
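
Editor's sketch: guido_g's answer is terse, so here is a toy model of why closing the socket helps. A REQ socket enforces strict send/receive lockstep, so a request that never gets a reply leaves the socket unable to send again; discarding it and creating a fresh one resets that state. `ToyReqSocket`, `recv_or_timeout` and `request` are hypothetical names for illustration only and are not the ZeroMQ API. A real client would poll the socket with a timeout, then close and re-create it.

```python
class ToyReqSocket:
    """Toy stand-in for a REQ socket: send and receive must alternate."""

    def __init__(self, server):
        self.server = server          # callable: request -> reply, or None for "no reply"
        self.awaiting_reply = False
        self._pending = None

    def send(self, payload):
        if self.awaiting_reply:
            # Mirrors the lockstep rule: no second send before a reply arrives.
            raise RuntimeError("previous request is still unanswered")
        self.awaiting_reply = True
        self._pending = self.server(payload)

    def recv_or_timeout(self):
        """Return the reply, or None to stand in for a poll timeout."""
        if self._pending is None:
            return None
        self.awaiting_reply = False
        reply, self._pending = self._pending, None
        return reply


def request(server, payload, retries=3):
    """Send payload; on a missing reply, drop the socket and retry on a new one."""
    sock = ToyReqSocket(server)
    for _ in range(retries):
        sock.send(payload)
        reply = sock.recv_or_timeout()
        if reply is not None:
            return reply
        # "Close the socket": the stuck state is discarded with the old object.
        sock = ToyReqSocket(server)
    return None
```

With a server that drops the first two requests, `request` succeeds on the third attempt, each attempt using a fresh socket object.
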
[07:38] CIA-32	libzmq: Martin Sustrik vtcp * rd5f3628 / (8 files): Different connecters simplified ...
[08:22] wadimgrasza	hi all, I have a question about the code examples in the zeromq guide: In examples/C/mdwrkapi.c:237: why is the socket recreated on the application level? Won't zeromq try to reestablish the connection by itself?
[08:33] fredix	hi
[08:34] fredix	is it possible to create a ZMQ_QUEUE with a ZMQ_PULL and a ZMQ_XREQ ?
[08:52] mikko	hi
[08:52] mikko	fredix: you mean a device?
[08:53] mikko	wadimgrasza: can you link me to the file?
[08:55] pieterh	wadimgrasza: hi, you still here?
[08:55] wadimgrasza	pieterh: yes
[08:56] pieterh	so to understand that pattern you have to have read the previous ones
[08:56] wadimgrasza	mikko: https://github.com/imatix/zguide/blob/master/examples/C/mdwrkapi.c#L237
[08:56] pieterh	this is basically the 'lazy pirate' pattern
[08:58] pieterh	wadimgrasza: so, in general, yes, 0MQ will auto-reconnect for you
[08:59] wadimgrasza	pieterh: I think I understand what the example is about, it's just that one thing that I don't get: why is the reconnecting performed on the application level?
[08:59] pieterh	in this particular case, so that the app can send a new READY message
[08:59] fredix	mikko: yes
[08:59] pieterh	that is, the connection is stateful, and the app wants to be in control of that
[09:00] pieterh	there are alternatives, which I developed later
[09:00] pieterh	for example if the server doesn't recognize the worker, it can challenge it, and the worker then responds
[09:01] pieterh	but basically, just reconnecting by itself isn't sufficient here
[09:02] pieterh	wadimgrasza: does this make sense?
[09:04] wadimgrasza	pieterh: ok, so you're saying that the worker is manually reconnecting because it doesn't want to send another READY message on the (possibly) same connection
[09:05] pieterh	the worker is manually reconnecting because the MDP spec doesn't provide any way for the broker and worker to hand-shake a new connection otherwise
[09:06] pieterh	manually reconnecting is the simplest way to ensure the two parties agree on state
[09:07] pieterh	note that the mdp worker api is based on the ppworker api, which is explained a little earlier in the doc
[09:07] pieterh	the text says, "The worker uses a reconnect strategy similar to the one we designed for the Lazy Pirate client."
[09:07] pieterh	maybe I'll explain that more in the text
[09:08] pieterh	it's kind of the same problem at the client side, except there it's also due to REQ sockets and their statefulness
[09:10] pieterh	wadimgrasza: it helps to understand if you look at spworker and ppworker
[09:17] wadimgrasza	pieterh: your explanation pretty much makes it clear, thanks! My question arose from wishing for a different behaviour, which basically means a different protocol. I didn't take into account the limitations of this protocol.
[09:25] pieterh	wadimgrasza: the protocol is not made in stone, of course: you could change it
[09:26] pieterh	btw thanks for your pull request, I'm checking that now...
[09:30] pieterh	wadimgrasza: can you provide me your email address, for the contributors list for the Guide?
[09:38] wadimgrasza	pieterh: sure
[09:38] pieterh	wadimgrasza: thanks, and nice work fixing those MDP brokers btw
[09:39] mikko	fredix: i think you can
[09:39] mikko	fredix: but notice that devices have been removed in 3
[09:40] pieterh	mikko: indeed, but we can put zmq_device() back if we want it
[09:40] pieterh	though no-one really seems to care, so it's been dropped
[09:42] mikko	pieterh: i reimplemented devices in the php extension code
[09:42] pieterh	right
[09:43] pieterh	tbh I've never used devices in anything realistic
[09:43] pieterh	I mean, the simple devices
[09:43] mikko	there are uses for the simple devices as well
[09:43] mikko	i've been using them in past for threaded programs
[09:43] pieterh	the device pattern is of course very useful but it always seems to end up wrapped in a little more intelligence
[09:43] mikko	for a single entry-point out
[09:43] pieterh	for sure
[09:44] mikko	very simple use-case but makes it easier when devices are bundled
[09:44] pieterh	and if they're bundled in libzmq they're faster too
[09:44] pieterh	that's the use case but I've not seen vocal demand for it
[09:45] mikko	a company that i know uses them for similar use-case
[09:45] mikko	they fork out worker processes
[09:45] mikko	they are not very vocal in the community
[09:46] mikko	what i am worried about at the moment is that what should be recommended to users?
[09:46] mikko	2.1 ?
[09:46] mikko	2.2?
[09:46] mikko	3.0?
[09:46] pieterh	it depends on what they need
[09:46] mikko	if people build on 2.x series, three will break a lot of things
[09:46] pieterh	I'd personally use 2.1
[09:46] mikko	and libzmq master is again incompatible with 3.x
[09:46] pieterh	because it is stable
[09:46] mikko	which means that similar thing is very close to happen with 4.x
[09:46] pieterh	and I'd be careful about not using stuff I know will disappear
[09:46] pieterh	e.g. durable sockets, devices, swap
[09:47] mikko	but users wont know that
[09:47] pieterh	replacing that over time, if I use that
[09:47] pieterh	hmm
[09:47] mikko	similar thing to this happened with imagemagick
[09:47] pieterh	I've been removing these things from the guide
[09:47] mikko	their api kept breaking and people got frustrated
[09:47] pieterh	this probably deserves an explanation in the guide
[09:47] mikko	so they made a new api called "magickwand" which is the stable api
[09:47] pieterh	:)
[09:47] mikko	and their "core api" is all bets off
[09:47] pieterh	well, we have that with language bindings
[09:48] pieterh	e.g. if you use CZMQ, it hides the core API
[09:48] mikko	but the problem is that we are exposing the changes to bindings maintainers
[09:48] mikko	it's not trouble-free upgrade
[09:48] pieterh	indeed
[09:48] mikko	and especially when all bindings are using the zmq api
[09:48] mikko	probably main reason being when they were written
[09:49] mikko	like for example the change from zmq_send to zmq_sendmsg is something i don't fully comprehend myself
[09:49] pieterh	my feeling is that for some projects I want stability, for others I want new functionality
[09:49] mikko	sure, but investing on 2.x series at the moment is a really bad investment
[09:49] mikko	especially for new code
[09:50] pieterh	well, I don't really think so
[09:50] mikko	and investing on 3.x isn't any better as the new master branch is already incompatible
[09:50] pieterh	in fact, 3.x and 4.x are purely experimental
[09:50] mikko	but 3.0 was released as 'stable'
[09:50] pieterh	nope
[09:50] pieterh	it was unstable, alpha
[09:51] pieterh	"Unstable release 3.0"
[09:51] mikko	but the versioning doesn't indicate that, so for many people 3.0.0 is stable release
[09:51] mikko	3.0.0-alpha1 would be alpha for many people
[09:51] pieterh	?
[09:51] mikko	at least that is what i am picking up
[09:52] pieterh	? the page is really explicit
[09:52] mikko	or 3.0.0-rc1 for release candidate
[09:52] pieterh	there is no tie between major/minor/patch numbering and stability
[09:52] pieterh	none whatsoever, and there never was
[09:53] pieterh	I wanted to make that, originally, was told "no way"
[09:53] pieterh	version numbering are 100% orthogonal to stability of the package
[09:53] pieterh	they indicate compatibility
[09:53] pieterh	the patch level, to some extent, indicates maturity
[09:54] pieterh	note that I don't particularly like this
[09:54] pieterh	adding 'rc1' or 'alpha' is a good suggestion
[09:54] mikko	i think the problem for a user is the disconnection between tar package name and stability level
[09:54] pieterh	i agree
[09:54] mikko	if i acquire zeromq-3.0.0.tar.gz from somewhere it doesn't carry the unstabledness with it
[09:55] pieterh	ok, so how do you want to solve this?
[09:55] pieterh	i'm more than happy to add suffixes
[09:55] pieterh	but let's be consistent then
[09:55] mikko	previously i've released several versions of same version
[09:55] pieterh	I don't do that
[09:56] pieterh	version number identifies the release uniquely
[09:56] pieterh	what we can do is add a suffix
[09:56] mikko	http://pecl.php.net/package/imagick
[09:56] pieterh	-unstable
[09:56] pieterh	-stable
[09:56] mikko	for example here
[09:56] pieterh	-legacy
[09:56] pieterh	-experimental
[09:57] pieterh	-rc
[09:57] pieterh	I used to use version numbers like 1.0a1, 1.0b2, to solve this
[09:58] pieterh	apparently that was not standard and totally confusing
[10:00] pieterh	anyhow, our development process does not match that of imagick
[10:00] pieterh	we release code then patch it into maturity
[10:01] pieterh	so e.g. we go from 2.1.7 to 2.1.8, not 2.2.0
[10:01] pieterh	2.2.0 would indicate a core change of some kind
[10:28] ianbarber	genuine question: how are we signalling subscriber durability in 4?
[10:28] ianbarber	is that just dropped along with manual IDs? or am i forgetting something obvious
[10:28] pieterh	ianbarber: it is dropped
[10:28] pieterh	this is actually the core of why explicit identities are such a pain to implement
[10:28] ianbarber	yeah
[10:29] pieterh	I've not looked at it, but my hypothesis is that subscriber durability can be better done using a device/broker
[10:30] pieterh	this seems to be a problem that was solved in the wrong layer
[10:30] pieterh	ianbarber: now that you're here what do you think of adding maturity indicators to packages?
[10:31] pieterh	mikko points out that '3.0.0' is deceptive, looks stable
[10:31] ianbarber	yeah, i agree on that one
[10:31] ianbarber	i think the process we have for 2 is pretty good, as you've said people wont use it otherwise
[10:31] ianbarber	but 3 is somewhat different
[10:32] pieterh	well, 3 is going to go the same way as 2
[10:32] ianbarber	because 2.x is always pretty much stable, the issues are usually minor if there are any
[10:32] ianbarber	but 3 is a big change, and it is significantly less tested
[10:32] pieterh	it took a while for 2.1 to become stable
[10:32] ianbarber	i've spoken to a few people that are talking about moving their infrastructures to 3
[10:32] ianbarber	which worries me somewhat, as they're not putting the time in i think is warranted
[10:32] pieterh	sure
[10:32] pieterh	this is ok
[10:33] ianbarber	yeah true, but because 2.0.8 was so well distributed, a lot of people took a long time to move to 2.1
[10:33] ianbarber	which all mostly worked out
[10:33] pieterh	we eventually had to force people to move to 2.1 just to get it stable
[10:33] pieterh	you know that the code base is in fact heavily heavily tested already
[10:33] ianbarber	see, the thing is for me that 2.0 was broken in some pretty fundamental ways
[10:33] pieterh	right, it was
[10:34] ianbarber	2.1+ does a job very well, there are some decisions that could be made differently, but i think 3 and so on are an expression of a refined, slightly different idea
[10:34] pieterh	afaics 3.0 is attractive for those who need subscription forwarding
[10:34] pieterh	4.0 will be attractive for its new router socket
[10:34] ianbarber	basically, my concern is that people will choose 3 instead of 2.1.8, when in fact they'd be better served with 2.1.8
[10:34] ianbarber	agreed
[10:34] pieterh	ok
[10:35] pieterh	so we attach maturity tags to all packages
[10:35] pieterh	if people use 3.0, we get error reports and it moves to stability
[10:35] pieterh	this is all good
[10:35] pieterh	but we have to be transparent about the risk
[10:35] ianbarber	yep
[10:36] pieterh	do we use -alpha / -beta / -rc / -stable etc.?
[10:36] pieterh	note that releases are already numbered so no rc1/rc2 etc.
[10:37] ianbarber	my inclination would be something very simple, alpha, beta and nothing for stable, and that should probably attach to branch
[10:37] ianbarber	2.1, 2.2-beta, 3.0-alpha
[10:37] ianbarber	(this is just off the top of my head now though, so not the worlds most considered opinion)
[10:38] pieterh	I'd prefer not changing the branches
[10:38] pieterh	this is just a file name tag
[10:38] pieterh	ok, -alpha, -beta, -rc
[10:39] pieterh	ianbarber: random question, do you want to present 0MQ at Software Freedom Day in Amsterdam, on 12th September?
[10:39] pieterh	s/12/14/
[10:39] pieterh	I was going to do that (Amsterdam!) but will probably be in Texas
[10:41] ianbarber	ah, well texas sounds fun too :) lemme check my calendar
[10:45] ianbarber	hmm, that wednesday does not look good. i'll see if I can shuffle anything,
[10:51] pieterh	mikko: ok... I've tagged all alpha / beta releases since 2.0.0
[10:51] pieterh	also changed the main download page to make it more obvious
[10:53] mikko	cool
[10:53] mikko	thanks
[10:53] pieterh	np, this is nicer
[10:53] pieterh	I just added -alpha, -beta, or -rc to the package name
[10:54] pieterh	and will continue to do this for future releases
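
Editor's sketch: the naming scheme settled on above (-alpha, -beta, or -rc appended to the package name) can be read back mechanically. The function below is a minimal illustration, not project tooling. The exact file name pattern is an assumption extrapolated from the "zeromq-3.0.0.tar.gz" name mentioned earlier, and treating a missing suffix as "stable" follows ianbarber's suggestion rather than any documented rule. `maturity` is a hypothetical helper name.

```python
import re

# Assumed pattern: zeromq-<major>.<minor>.<patch>[-alpha|-beta|-rc].tar.gz
_PACKAGE = re.compile(r"^zeromq-(\d+\.\d+\.\d+)(?:-(alpha|beta|rc))?\.tar\.gz$")


def maturity(filename):
    """Return (version, maturity tag) for a package file name."""
    match = _PACKAGE.match(filename)
    if match is None:
        raise ValueError("unrecognised package name: " + filename)
    version, tag = match.groups()
    # Assumption: no suffix means the release is considered stable.
    return version, tag or "stable"
```

For example, `maturity("zeromq-3.0.0-alpha.tar.gz")` yields the version and the "alpha" tag, so the warning travels with the file wherever it is downloaded from.
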
[11:04] pieterh	mikko: do you want to discuss the plan for 2.x->4.x migration?
[11:09] mikko	pieterh: sure, should we start an email thread?
[11:09] mikko	more async than irc
[11:09] pieterh	mikko: how about a wiki page with discussion on the list?
[11:10] pieterh	I think we need a documented plan
[11:11] pieterh	i'll throw something together to kick off discussion
[11:11] mikko	yes
[11:11] mikko	makes sense
[11:11] mikko	im on wireless with 4000ms latency so irc is not the most pleasant
[11:11] pieterh	ok, I'll email the list when it's there, an hour or two
[11:11] pieterh	:-)
[12:03] pieterh	mikko: ok, it's there: http://www.zeromq.org/topics:planning
[12:17] Seta00	has anyone here successfully build zeromq with clang?
[12:18] Seta00	did*
[12:22] pieterh	Seta00: did you check the list and google?
[12:23] Seta00	pieterh: heh, what I'm doing is rather unusual, I'm cross-compiling for 32bit on clang
[12:23] Seta00	but yeah, I've checked
[12:24] Seta00	it builds fine for the default architecture (x86_64)
[12:24] pieterh	mikko is the genius in this department
[12:24] Seta00	I think it's a clang bug
[12:24] Seta00	ld: bad codegen, pointer diff in __ZN3zmq4lb_tD2Ev to global weak symbol __ZTVN3zmq15i_writer_eventsE for architecture i386
[12:25] pieterh	ah, the infamous __ZN3zmq4lb_tD2Ev bug
[12:26] Seta00	hah
[12:34] cremes	Seta00: there are 2 threads covering that error on stackoverflow; apparently the fix is to set "symbols hidden by default"
[12:34] cremes	to yes in the build settings (for xcode)
[12:34] cremes	but that must be passed down to clang somehow
[12:36] Seta00	really?
[12:36] Seta00	do you happen to have a link?
[12:36] cremes	yeah, i can't easily find the equivalent cli switch
[12:36] cremes	hold on...
[12:36] cremes	http://stackoverflow.com/questions/5285844/bad-codegen-pointer-diff-linker-error-with-xcode-4
[12:36] cremes	http://stackoverflow.com/questions/6087292/bad-codegen-pointer-diff-in-boost-error-in-32-bit-build
[12:37] cremes	that second one might be more on point since you are cross compiling
[12:37] Seta00	oh, I included zeromq in the search, that's why those didnt come up
[12:38] cremes	my google-fu is apparently stronger than yours ;)
[12:38] Seta00	oh
[12:38] Seta00	it's like on GCC
[12:38] Seta00	-fvisibility=hidden
[12:41] Seta00	$ lipo -info src/.libs/libzmq.dylib
[12:41] Seta00	Non-fat file: src/.libs/libzmq.dylib is architecture: i386
[12:41] Seta00	cremes: many thanks ;)
[12:41] cremes	you're welcome
[14:19] xyzzy	So, still using the c# windows binding, I'm running into an error that crashes my server: 'Assertion failed: fds.size () <= FD_SETSIZE (..\..\..\SRC\SELECT.CPP:70)'. I've searched on this, but not found much in the way of how not to incur the wrath of this error.
[14:19] xyzzy	Any suggestions?
[15:40] sustrik	xyzzy_: you have to redefine FD_SETSIZE
[15:47] xyzzy	sustrik, is there an easy way for a newb to do that on windows?
[15:48] xyzzy	As I understand what I'm reading, it means rebuilding the libzmq.dll from source, but, that's only as I understand it.
[15:48] xyzzy	I'm prone to misunderstanding :P
[15:50] sustrik	well, open the MSVC solution
[15:50] sustrik	open properties dialog box
[15:50] sustrik	find where custom build options are being set
[15:50] sustrik	modify the value for FD_SETSIZE
[15:56] xyzzy	Hm. I suspect this isn't as easy as it's being explained, but, I will see what happens after muddling my way through a bit. :P
[16:00] xyzzy	Well, okay, maybe it is... just have to hunt down the build options bit.
[16:00] xyzzy	Was suspecting there was gonna be a ton of issue getting it to load the project in VS2010. :P
[16:02] sustrik	i guess it's somewhere near the bottom of the dialog box
[16:03] sustrik	not 100% sure though
[16:08] xyzzy	Well, that's the normal properties dialog for stuff, but, I'm not seeing anything that sounds like it relates to FD_SETSIZE. It looks like just the path to the file and such.
[16:14] sustrik	well, i am not sure how exactly msvc2010 property dialog box looks like
[16:15] sustrik	however, in the project file the value is set here:
[16:15] xyzzy	Hmmm... In the source I have, the problem code doesn't appear to be there... I wonder if I'm running into this issue because I have an outdated libzmq.dll. I shall try running my project with the self-compiled version from latest source and see if things blow up horribly.
[16:15] sustrik	https://github.com/zeromq/zeromq2-1/blob/master/builds/msvc/libzmq/libzmq.vcproj#L43
[16:15] sustrik	the error is actually a windows limitation
[16:15] sustrik	you can't poll on more than FD_SETSIZE file descriptors
[16:16] sustrik	which limits the number of connections you can handle in parallel
[16:19] xyzzy	Yeah, i kinda gathered that, and albeit I didn't find the solution i expected. Recompiling from source instead of using the libzmq binary included with the c# runtime, appears to have fixed the issue. :)
[16:19] xyzzy	So thanks. :)
[16:21] xyzzy	The old dll must have been set extremely small.
[16:21] xyzzy	I'm guessing the minimum 64.
[16:22] xyzzy	Anyways, I know I already said it, but, thanks, this has been bugging me and I'm super anxious about even asking questions here.
[16:36] sustrik	you are welcome :)
[16:42] ianbarber	sustrik: zed was asking on twitter about durability on pub/sub in 4
[16:43] ianbarber	i was about to say it's not there, but reading the code i'm not so sure
[16:43] ianbarber	afaics, durability was basically down to the fact that a socket with an explicit identity was basically a reconnection
[16:44] ianbarber	the identity just flagged how a detected disconnect should affect messages
[16:44] ianbarber	is that functionality still in there, or has it been removed?
[16:50] ianbarber	i'm guessing it was in the session code that went from socket_base, so no
[16:52] ianbarber	zedas: i am reasonably confident that durability isn't in 4.0, though I think sustrik would know for sure. Not sure you need it though, unless mongrel2 will remember the connection IDs on restart - there only issue is handler -> m2 as far as I can see
[16:54] ianbarber	must admit i do like how much tidier the classes are in 4 :)
[17:28] sustrik	ianbarber: without identities there's no way to identify specific client
[17:28] sustrik	so, when the client connects, the server has no idea which messages were already received and which were not
[17:30] pieterh	ianbarber: I think the conclusion was that the new ROUTER sockets can effectively replace identities for most patterns
[17:30] ianbarber	oh yeah, ROUTER is fine
[17:30] ianbarber	this was mainly for the PUB SUB durable case
[17:30] pieterh	4.0 router is magic stuff
[17:31] pieterh	for durable pubsub, I'd consider something like the clone pattern
[17:31] ianbarber	is there actually a plan for how it will work, or is it literally magic stuff
[17:31] pieterh	ianbarber: well, there was very brief discussion, no written proposal, and 30 minutes later Martin announced that it was working :-)
[17:31] pieterh	so, yes, magic
[17:32] ianbarber	hehe
[17:33] ianbarber	sustrik: so will any TCP interruption that causes reconnect always be treated as new connection
[17:33] ianbarber	i was under the impression the identity was still sent, so that case should still work as before
[17:34] ianbarber	but in a case of subscriber dying and generating a new identity it's side, it'll have missed some
[17:35] sustrik	yes, it will be treated as a new connection
[17:35] sustrik	however, keep in mind that TCP timeout is 2hrs
[17:35] sustrik	if it exists at all
[17:36] sustrik	so, tcp connection failure is almost always caused by application dying
[17:36] ianbarber	fair point
[17:38] ianbarber	the only concern I would have is wireless devices - lets say i have a subscriber over mobile network or similar, i would always have to explicitly add a sequence number and track receiving in case my connection dropped and was reestablished, as there is no way at application level of detecting that the tcp connection dropped and reconnected
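
Editor's sketch: the sequence-number tracking ianbarber describes can be as small as the class below. It assumes the publisher stamps every message with an integer that increases by one each time; that numbering scheme and the `GapDetector` name are illustrative assumptions, not something ZeroMQ provides.

```python
class GapDetector:
    """Tracks the highest sequence number seen and reports what was skipped."""

    def __init__(self):
        self.last_seq = None

    def observe(self, seq):
        """Record seq and return the sequence numbers missed just before it."""
        if self.last_seq is None:
            # First message: nothing to compare against yet.
            self.last_seq = seq
            return []
        if seq <= self.last_seq:
            # Duplicate or stale message: ignore it.
            return []
        missed = list(range(self.last_seq + 1, seq))
        self.last_seq = seq
        return missed
```

A subscriber would call `observe` on every message; a non-empty result means the connection dropped and reconnected somewhere in between, and the application can decide whether to request the missing messages or just log the loss.
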
[17:39] zedas	pieterh: so the answer is "everyone should just use ROUTER and go back to tracking clients manually just like with sockets"?
[17:39] zedas	part of the advantage of zeromq was that simple servers didn't have to do their own client tracking.
[17:39] pieterh	zedas: ROUTER lets you build custom patterns
[17:39] zedas	and, if that's the design direction, then zeromq needs to add a client disconnect/reconnect callback system.
[17:39] ianbarber	zedas: you still don't. PUSH/PULL is your client comms, you just need to have a listening socket for incoming.
[17:39] pieterh	yes, that's what ROUTER adds in 4.0
[17:40] pieterh	you get what approaches TCP style sockets
[17:40] pieterh	except all the 0MQ stuff as well
[17:40] zedas	without callbacks indicating connect/disconnect you have to use crappy timeouts to keep your state storage clean.
[17:40] pieterh	except message distribution
[17:41] zedas	so, 4.0 will have ROUTER sockets plus connect/disconnect messages AND it'll continue to work without needing to run everything through zmq_poll (i.e. I can use my own epoll or kqueue)?
[17:41] ianbarber	zedas: why would you need to track connects/disconnect at all?
[17:41] zedas	ianbarber: if i have to keep track of client connections in a ROUTER, then I have to setup internal state, and if I don't know when the clients are gone or come back then I can't manage that state.
[17:42] pieterh	zedas: indeed
[17:42] pieterh	we do this now using heartbeating
[17:42] pieterh	it's doable but somewhat extra work
[17:42] ianbarber	zedas: i don't think you do though, for your use case, unless i've misunderstood it
[17:42] pieterh	by exposing connection/disconnection events, that work disappears
[17:42] ianbarber	you only need router on the application handler side
[17:43] zedas	pieterh: same here, so i end up with this janky hashmap+handrolledGCusingtimeouts
[17:43] ianbarber	ah, i see, but the dropping would not replicate the current replies
[17:44] zedas	ianbarber: ok lemme break it down real quick: with a ROUTER, you get a message and you get some "client id" right? you use this ID to talk to the client.
[17:44] zedas	ok, now i keep those in some hashmap.
[17:44] zedas	next, zeromq gets rid of persistent IDs, so each client is making new ones.
[17:44] zedas	finally, i get a client that connects and disconnects a ton of times.
[17:45] pieterh	zedas: how much of the Guide have you read?
[17:45] zedas	my hashmap now fills with useless client IDs until I get around to doing a heartbeat or a timeout.
[17:45] zedas	if i've got disconnects, i can clean up immediately.
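
Editor's sketch: the "hashmap plus timeout GC" zedas describes, next to the immediate cleanup a disconnect notification would allow. `PeerTable` and its methods are hypothetical names; the clock is injectable only so the behaviour can be checked without sleeping. Nothing here is ZeroMQ API.

```python
import time


class PeerTable:
    """Per-client state keyed by client ID, with timeout-based cleanup."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.last_seen = {}

    def touch(self, peer_id):
        """Call on every message (or heartbeat) received from peer_id."""
        self.last_seen[peer_id] = self.clock()

    def disconnect(self, peer_id):
        """With a disconnect event, state can be dropped right away."""
        self.last_seen.pop(peer_id, None)

    def expire(self):
        """Without such events, stale IDs linger until a sweep like this runs."""
        now = self.clock()
        dead = [p for p, seen in self.last_seen.items() if now - seen > self.ttl]
        for peer_id in dead:
            del self.last_seen[peer_id]
        return dead
```

The trade-off under discussion is visible here: `expire` must be called periodically and can only remove a client after `ttl_seconds` of silence, while `disconnect` removes it the moment the event arrives.
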
[17:45] zedas	pieterh: i read it a while ago, and once again when i was trying to figure out how clientids changed.
[17:46] zedas	remember i was telling you the description for IDs didn't make sense?
[17:46] ianbarber	zedas: that sounds like you're trying to do what you currently do with PULL and PUB in one socket though - I'd keep that.
[17:46] ianbarber	sustrik: Does ROUTER drop if it can't route, or block/fail?
[17:47] zedas	ianbarber: uh, no it's a server that's talking to a client and needs the id to talk to it.
[17:47] zedas	i mean, what are you guys doing? just entirely stateless and never storing client IDs for later operations?
[17:47] ianbarber	zedas: what I would have: m2 PUSH to any handler, including a DSN. Handler creates REQ socket, stores in hash, sends reply to that directly. If the server has disappeared, handler can decide what to do.
[17:48] ianbarber	M2 then just has a PUSH socket and a XREP for getting replies, otherwise exactly the same
[17:48] zedas	well i'm not talking about mongrel2 here, i'm talking about other servers I've written with XREP
[17:48] ianbarber	ah
[17:48] zedas	every one ends up having to do a heartbeat/timeout/GC thing to keep the state clean.
[17:48] ianbarber	again though, like if you're using a hash, just use multiple sockets
[17:49] ianbarber	they're not massively expensive
[17:49] ianbarber	and you can just clean up when they fail to send
[17:49] pieterh	ianbarber: multiple sockets don't work that well
[17:49] pieterh	but iirc sustrik made or will make this work like TCP in 4.0
[17:49] pieterh	so you actually get a socket handle when a client connects
[17:50] zedas	uh, wait, let's go back to your idea for mongrel2: are you saying that handlers would get a request, make a whole new REQ socket, send that address to mongrel2, and then do REQ/REP to do a reply?
[17:50] ianbarber	nah
[17:50] ianbarber	so, simple case
[17:50] zedas	so, no matter how light you think zmq sockets are, they are totally not light.
[17:50] ianbarber	1 m2 - handler gets it's first message from m2, makes a socket, replies down that socket.
[17:50] ianbarber	second message, replies down the same socket
[17:51] ianbarber	if it gets a message from another m2, it makes a new socket, replies to that
[17:51] zedas	i already have to make the stacks in my server close to 120k because it creates so much ram on the stack it's sick.
[17:51] ianbarber	so the total number of sockets is total number of m2s talking to that handler
[17:51] ianbarber	if one dies, it'll fail to send on that socket, and handler can wait or cleanup as appropriate
[17:52] zedas	uh, haha, yeah that's not going to work and requires me to change how mongrel2 works internally to suit your latest idea
[17:53] ianbarber	at the moment, most handlers have config for which m2 to connect to in their SUB right?
[17:53] ianbarber	this would make it so you could add m2s transparently
[17:53] ianbarber	but yeah, i wasn't expecting you to rewrite the core m2 transport setup based on an irc chat :)
[17:54] zedas	but, listen to what you're saying. "Oh it's easy, you just send a socket address to the receiving side and then it makes a REQ/REP socket and uses that from then on."
[17:54] ianbarber	actually, i'm completely lying about the adding transparently, as they'd still need to connect the PULL
[17:55] zedas	there's all sorts of problems with that. first, REQ/REP blows ass. they crash at the slightest little hiccup in processing.
[17:55] ianbarber	use XREQ then, it's no bones
[17:55] ianbarber	I know it's a big change, but so's removing identities in ZMQ4
[17:55] ianbarber	i mean, it's not like this is some incremental tweak
[17:56] zedas	then, each handler just implicitly trusts an address it's given? no way. then it makes a socket and tries to connect to that addr, so now mongrel2 has to know all of the info needed for external connections to it?
[17:56] ianbarber	it already does - the PUB DSN is in the config
[17:56] zedas	and, now you're saying every handler has to use XREQ, i mean hell, why not just get rid of all the other sockets and say it's all ROUTER
[17:56] ianbarber	sorry, SUB DSN
[17:56] ianbarber	it would be same
[17:56] ianbarber	router would work too, but you don't need the functionality, XREQ is simpler
[17:57] zedas	no, it wouldn't because the huge difference is zeromq handles all the bullshit of SUB. you're basically saying i should reimplement that.
[17:57] zedas	and why? because you don't want to make a callback telling me when a client disconnects. a totally reasonable feature in a server.
[17:57] pieterh	zedas: what are you using pubsub for?
[17:58] zedas	well at the time it was the only reliable way to send to all N servers.
[17:58] zedas	and XREQ is too complicated for mere mortals to use.
[17:58] zedas	hell, you guys can't even document it well, and you wrote it.
[17:58] zedas	and REQ is too touchy, blows up or stalls at the slightest missed message or bad state
[17:59] sustrik	re
[17:59] sustrik	catching up with the discussion...
[17:59] pieterh	zedas: so you might want to check out the Clone pattern in the Guide
[17:59] zedas	so that leaves, at the time, PUSH/PULL or PUB/SUB, and SUB had subscriptions so I could target a specific server with an id but send to all.
[17:59] pieterh	it's a complete reliable pubsub solution
[17:59] pieterh	with implementation in C and possibly other languages
[17:59] pieterh	distributed key-value cache
[17:59] zedas	does it involve handlers using XREQ/XREP?
[18:00] ianbarber	pieterh: the downside of clone for M2 is that it makes implementing handlers way more config, as M2 is inverted from normal, lots of PUBs to few SUBs type of thing
[18:00] sustrik	ad ROUTER socket in 4.0: it reports connections/disconnections
[18:00] ianbarber	sustrik: excellent
[18:00] zedas	sustrik: thank you. if you got that then this entire conversation is pointless.
[18:00] sustrik	and the send blocks if hwm is reached
[18:00] pieterh	I already said this ages ago
[18:01] sustrik	yup, it's already implemented in the trunk
[18:01] pieterh	yes
[18:01] zedas	pieterh: no you said maybe and then tried to get me to redesign my stuff to work around not having it.
[18:01] pieterh	(07:42:40 PM) pieterh: by exposing connection/disconnection events, that work disappears
[18:01] pieterh	sorry, I wasn't clear
[18:02] pieterh	4.0 has a new super ROUTER socket that explicitly provides connect/disconnect events
[18:02] pieterh	and also, possibly (sustrik?) will provide independent sockets for each peer
[18:02] pieterh	allowing output polling on each connection independently
[18:03] ianbarber	sustrik: on a disconnect, if you send to a specific socket with a ROUTER will it drop the message or block?
[18:03] ianbarber	specific node i mean
[18:03] zedas	then that'll work, as long as handler authors don't have to implement ROUTER too
[18:03] zedas	seriously, you guys live and breathe this stuff so it makes sense to you, but everyone else is very confused by ROUTER
[18:03] sustrik	ianbarber: it will report an error if the specified peer doesn't exist
[18:03] ianbarber	sustrik: cool
[18:03] sustrik	if the hwm is reached, it'll block
[18:04] zedas	it's become this zeromq swissarmy knife that you use to solve all problems, so now it's overloaded with everything
[18:04] pieterh	zedas: yes, understood... if you have any ideas for making it simpler, always welcome
[18:04] pieterh	I'd personally like 1 socket per connection
[18:04] pieterh	it's much simpler for this work
[18:04] pieterh	and matches TCP more closely
[18:05] ianbarber	i would imagine it will have to be a ROUTER in the handler
[18:05] ianbarber	but yeah, the router is a bit conceptually simpler than it used to be, so maybe that won't be a problem? i guess that'll depend on the docs
[18:05] sustrik	yes, it's a swiss army knife, allows you to build any pattern you can think of
[18:06] pieterh	sustrik: moving to accept->socket would be good IMO
[18:06] zedas	pieterh: re: clone pattern, yes i read this, and it's flawed. You have clients using a REQ socket, so they block indefinitely and if the server dies mid-request they hang.
[18:06] pieterh	zedas: the clients use DEALER, not REQ
[18:06] sustrik	pieterh: what you get is a connection event with connection ID
[18:06] sustrik	kind of like socket
[18:07] pieterh	sustrik: we discussed this, there's no way to do accurate output polling
[18:07] zedas	pieterh: huh? in this? http://zguide.zeromq.org/page:all#A-Shared-Key-Value-Cache-Clone-Pattern
[18:07] zedas	also, DEALER? that's a new one too.
[18:07] zedas	when did that show up?
[18:07] pieterh	zedas: lemme take a look... DEALER's been around for a year,
[18:08] pieterh	it's new only if you don't follow the zeromq-dev list :-)
[18:08] ianbarber	picture looks to be out of date
[18:08] pieterh	it has seriously been discussed like on 10 threads
[18:08] ianbarber	zedas: DEALER = XREQ
[18:08] sustrik	router and dealer were just aliases for xrep/xreq
[18:09] zedas	pieterh: there's no DEALER in the clone pattern diagrams and a quick glance through later code doesn't mention it.
[18:09] zedas	ianbarber: ah, so it got a name change.
[18:09] pieterh	zedas: it's how the clone class (client API) is implemented
[18:09] pieterh	the diagrams are inaccurate and/or over-simplified
[18:10] pieterh	the clone pattern took 5 iterations to develop
[18:10] pieterh	sorry, six
[18:10] pieterh	"self->snapshot = zsocket_new (ctx, ZMQ_DEALER);"
[18:11] zedas	pieterh: ok so what you're saying is, the diagrams are misleading, so then I'd have to read version 6 of the client to see you're using DEALER, which is an alias for XREQ, which brings me back to my original problem: XREQ/XREP are too complicated for most people new to zeromq.
[18:11] pieterh	well, this is chapter 5 of the Guide
[18:11] sustrik	clone = reliable pub/sub ?
[18:11] pieterh	yes
[18:11] zedas	i mean really guys, if i'm keeping track of client connections all the time in zeromq then it's basically a socket fd, just design it like that.
[18:11] pieterh	full distributed key/value cache
[18:11] pieterh	zedas: yes
[18:11] pieterh	sustrik: I have to go but zedas is right
[18:12] pieterh	the proper API for ROUTER is independent sockets per connection
[18:12] pieterh	addition of an accept() api call
[18:12] pieterh	proper POSIX compatible semantics for new/dead connections
[18:13] zedas	pieterh: i totally agree with that, and I also have to go to work. :-)
[18:13] sustrik	that's what TCP does
[18:13] pieterh	those semantics exist and are known
[18:13] pieterh	exactly
[18:13] sustrik	why bother with 0mq then
[18:13] pieterh	OMG sustrik
[18:13] pieterh	seriously?
[18:13] ianbarber	again?
[18:13] sustrik	the goal imo is to provide reusable patterns
[18:13] ianbarber	:)
[18:13] pieterh	do I have to spell that list of 0MQ features out AGAIN?
[18:13] pieterh	it's the 3rd time, dude
[18:13] ianbarber	right, i'm gonna go as well :)
[18:13] pieterh	get with the program here
[18:13] pieterh	:)
[18:14] sustrik	see you
[18:18] pieterh	sustrik: did you seriously forget that long list of things 0MQ does for you, which TCP doesn't, or are you trolling me?
[18:19] pieterh	"Dumb" as in, portable, with 30 language bindings, doing framing,
[18:19] pieterh	multipart, async i/o, multiple transports via one API, high-water
[18:19] pieterh	marks, zero copy, etc.
[18:20] sustrik	once you move to TCP-like behaviour some of those features will disintegrate
[18:20] sustrik	for example, no PGM
[18:21] pieterh	this is purely an API design issue
[18:21] sustrik	HWMs become just a duplicate of SO_SNDBUF
[18:21] pieterh	you are basically inventing an API that POSIX already provides
[18:21] pieterh	that is bad for several reasons
[18:22] sustrik	i guess the only reason for 0mq existence with this model is TCP message delimitation
[18:22] sustrik	which could be done in much simpler way
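For readers unfamiliar with the "message delimitation" being argued over: a TCP stream has no message boundaries, so a library has to add them. One common approach is a length prefix in front of each payload. The sketch below is a generic illustration in plain Python; it is not the 0MQ wire format, and `frame` / `unframe` are hypothetical helper names.

```python
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix (an assumption)


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


def unframe(buffer: bytes):
    """Split a received byte buffer into complete messages.

    Returns (messages, remainder); remainder holds any trailing partial
    message and should be prepended to the next chunk read from the stream.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        if len(buffer) - start < length:
            break  # body not fully received yet
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]


# Simulate a stream delivering one whole message and part of a second.
second = frame(b"world")
first_chunk = frame(b"hello") + second[:6]
messages, remainder = unframe(first_chunk)
later_messages, leftover = unframe(remainder + second[6:])
```

The receiver keeps the remainder between reads, which is the bookkeeping a messaging library would otherwise hide from the application.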
[18:25] pieterh	ok, to be really clear, because I'
[18:25] pieterh	I'm reimplementing large chunks of the TCP layer in VTX
[18:25] pieterh	and I know the pain...
[18:25] pieterh	1. message framing
[18:25] pieterh	2. asynchronous message dispatch
[18:25] pieterh	3. connection to other transports via one API
[18:25] pieterh	4. performance
[18:26] pieterh	5. portability
[18:26] sustrik	1. yes
[18:26] pieterh	6. error handling
[18:26] pieterh	7. signal handling
[18:26] pieterh	7. multipart message handling
[18:26] sustrik	2. BSD sockets have that
[18:26] pieterh	did I already say 7? sorry, 8
[18:26] sustrik	3. BSD sockets have that
[18:26] pieterh	nope, they do not
[18:26] pieterh	sorry, they do not
[18:26] sustrik	4. BSD sockets have better performance
[18:26] pieterh	except for small amounts
[18:26] pieterh	come on
[18:26] sustrik	5. BSD sockets have that
[18:26] sustrik	6. BSD sockets have that
[18:26] pieterh	I've spent the last 20 years writing TCP servers
[18:26] sustrik	7 yes
[18:27] pieterh	please don't trivialize this away
[18:27] pieterh	and please address my actual question instead of trolling
[18:27] sustrik	what's the question?
[18:27] pieterh	you are inventing a new API for socket presence / disconnection
[18:27] pieterh	it's not POSIX
[18:27] pieterh	that is bad for two or three reasons
[18:28] pieterh	your API is equivalent to one socket per peer
[18:28] pieterh	except worse
[18:28] pieterh	and more complex
[18:28] pieterh	and not standard
[18:28] sustrik	exactly
[18:28] pieterh	and confusing
[18:28] pieterh	well
[18:28] pieterh	why then dismiss the whole problem scope
[18:28] pieterh	instead of just fixing the API?
[18:28] sustrik	that's why i've been reluctant to add the thing to 0mq for so ling
[18:28] sustrik	long
[18:28] pieterh	shrug
[18:29] pieterh	I don't see why you make stuff that's really valuable and then screw up the API
[18:29] pieterh	either don't make it at all, or discuss with users and get the API right up front
[18:30] sustrik	i would rather drop it
[18:30] pieterh	fixing it a year later when everyone's invested in the old clunky API is... not efficient
[18:30] sustrik	but the reason i added it
[18:30] pieterh	drop it, drop 80% of use cases
[18:30] pieterh	your call
[18:30] sustrik	was that designing new patterns inside 0mq is costlu
[18:30] pieterh	you want users, or you don't
[18:30] sustrik	costly
[18:30] pieterh	mass is mass
[18:30] pieterh	you either make software that's relevant and that gives you mass
[18:30] pieterh	or you don't
[18:30] pieterh	but if you aim for mass, do it right please
[18:31] pieterh	I'm really not satisfied with the quality of the upfront design process
[18:31] pieterh	it's almost totally absent
[18:31] pieterh	and that is visible
[18:31] pieterh	very visible
[18:32] sustrik	once again, in this model the only value add of 0mq is message delimitation and multi-part messages
[18:32] pieterh	and high performance async io
[18:32] sustrik	if that's what you want i can code such library in 2 days
[18:32] sustrik	the performance is actually lower than BSD socket performance
[18:33] pieterh	raw BSD sockets don't do anything
[18:33] sustrik	?
[18:33] sustrik	they do exactly what you are asking for, no?
[18:33] pieterh	they only work rapidly in single threaded apps
[18:33] sustrik	ah, you mean inproc
[18:33] sustrik	right?
[18:33] pieterh	look, this is the wrong time and model for such discussions
[18:34] sustrik	ok
[18:34] pieterh	it is hugely frustrating to try to explain to you, of all people, the value 0MQ adds
[18:34] sustrik	the value is handling multiple connections in a transparent manner
[18:34] pieterh	there is such a gap between my experience writing multithreaded applications, in anger
[18:35] pieterh	I've been doing this since 1991 or so
[18:35] sustrik	once you have to handle connections on a one-by-one basis, the value-add mostly disappears
[18:35] pieterh	multithreaded communications servers
[18:35] sustrik	inproc, right?
[18:35] pieterh	now, when you tell me to "just use BSD sockets", you are being either massively naive, randomly insulting, or successfully trollish
[18:35] pieterh	I cannot figure out which it is
[18:35] sustrik	nope, i really don't have an idea
[18:36] sustrik	there are few nice details there
[18:36] sustrik	like specifying address as a string
[18:36] pieterh	I'll explain in an email
[18:36] pieterh	tomorrow
[18:36] sustrik	ok
[19:58] CIA-32	pyzmq: MinRK gh-pages * rac82456 / (106 files in 11 dirs): update to 2.1.8 - https://github.com/zeromq/pyzmq/commit/ac824568efd8c93d55a9b634c87630f6ba5a2b6f
[20:17] CIA-32	pyzmq: MinRK master * r8c12877 / README.rst : updated README for 2.1.8 - https://github.com/zeromq/pyzmq/commit/8c12877d09b8f00a5156ce918e2b2a214a18a8b1
[20:17] CIA-32	pyzmq: MinRK master * r795c55f / README.rst : mention homebrew in README - https://github.com/zeromq/pyzmq/commit/795c55f1c54a4932d070125e1b324abf32195ba5
[20:17] CIA-32	pyzmq: MinRK master * r42b6035 / (3 files): add change notes to docs - https://github.com/zeromq/pyzmq/commit/42b6035657e0375c141bd8b986faa2eeb97136ba
[20:17] CIA-32	pyzmq: MinRK master * r4b322d5 / (5 files in 4 dirs): sphinx cleanup ...
[20:17] CIA-32	pyzmq: MinRK v2.1.8 * r18777a2 / zmq/core/version.pyx : release 2.1.8 - https://github.com/zeromq/pyzmq/commit/18777a2e468cb0a66e45b34a4fe472f2c67442ed
[20:23] fredix	hi
[20:30] mikko	hi
[21:16] fredix	hi
[21:17] fredix	why will you remove device ?
[22:03] evgeny_boger	Hi! I'm experiencing a kind of weird problem with zeromq. Would someone please try to help me?
[22:06] evgeny_boger	The asyncsrv example with XREP/XREQ sockets just doesn't work on my Ubuntu machine. The client just doesn't receive any messages. I've tried both precompiled zeromq packages and builds from source. Both zmq 2.1 and 3.0. Both C and Python samples. Both unix sockets and tcp.
[22:06] evgeny_boger	And it's 100% reproducible. And it works well on my Debian server out of the box. Any ideas?
[22:09] cremes	evgeny_boger: any error messages?
[22:09] evgeny_boger	No, not at all
[22:09] cremes	so what does it do? hang?
[22:10] evgeny_boger	The XREQ side does not receive any messages
[22:10] evgeny_boger	it can send messages, and these messages are properly received by the XREP side
[22:11] evgeny_boger	So, for the asyncsrv example I just don't get the lines like "Client worker-6 received: request #7". Everything else is ok
[22:11] cremes	do any of the other examples work or is this the only one that fails for you?
[22:12] evgeny_boger	I didn't check all of them. But for instance hwserver/hwclient work well. As well as 'weather' example
[22:14] cremes	evgeny_boger: it's weird that it works on one box but not another...
[22:14] cremes	things to check... read the code to see what ports the example is using and then
[22:14] cremes	use netstat -an to verify that some other process hasn't already grabbed the port
[22:14] cremes	also, make sure you are using the exact same tarball on both boxes... don't install 0mq using
[22:15] cremes	the package manager because it may be out of date
[22:15] cremes	good luck
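The port check cremes suggests with netstat can also be approximated from a script by trying to bind the port yourself. This is a rough sketch using only the Python standard library; `port_is_free` is a hypothetical helper, the port number is whatever the example under test uses, and a successful bind only shows that nothing held the port at that instant.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another process has the port.
            return False
        return True
```

Calling `port_is_free(n)` before starting the server side of an example, with `n` taken from that example's source, gives a quick yes/no; `netstat -an` remains the better tool for finding out which process holds the port.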
[22:15] evgeny_boger	thanks!
[22:16] evgeny_boger	the problem is definitely not related to ports, since the behaviour is the same for unix sockets. And it doesn't work for any of the versions I've tried. That's weird. I don't even know how to send a bug report because I can't reproduce it on another box
[23:06] traviscline	pieterh: I'm the author of gevent-zeromq a compat lib for gevent - a coroutine library for python -- the examples that use zmq.Poller need modification to fit in an environment utilizing my library -- once I have a collection I'll probably submit a pull req. -- re: example dir Python-gevent or the like? don't see another example dir for a similar alternate implementation route
[23:07] mikko	traviscline: python-gevent sounds good
[23:07] traviscline	or would the likelihood of inclusion be too low?
[23:07] traviscline	rgr
[23:07] mikko	traviscline: i think if you submit fairly complete examples for the things they should be included
[23:08] mikko	there is no reason not to
[23:31] traviscline	pieterh: little start here https://github.com/traviscline/zguide/commit/556dbbc419cdfe5b97087fa1874141f5de756cd7 but i'll ping when i flesh out more