ZeroMQ IRC Log

Thursday March 3, 2011

[Time] Name	Message
[01:16] cremes	ha, figured out why my zmq_connect() calls were returning an error
[01:17] cremes	i was trying to connect two sockets in different contexts via inproc
[01:17] cremes	that doesn't work :)
[02:58] cremes	i'm digging this inproc transport
[02:58] cremes	it reduces latency by at least 30%... me likey
[03:07] Steve-o	just need a Windows version
[06:51] Steve-o	mikko: pingu
[06:58] Steve-o	mikko: ready for bumping zeromq with autoconf of openpgm
[07:44] guido_g	morning all
[07:44] pieterh	guido_g: g'morning
[07:44] guido_g	hi pieterh
[07:44] guido_g	just saw that you've been busy
[07:46] guido_g	will have a look at the mdp broker later
[07:48] pieterh	it does everything but leaks a little memory, and is slow
[07:50] guido_g	"The biggest improvement in performance is the non-working-to-working
[07:50] guido_g	transition."
[07:50] guido_g	-- John Ousterhout
[07:50] pieterh	what's nice is how it all handles when you stop/restart pieces
[07:51] guido_g	will use it as counterpart for my python mdp implementation
[07:52] pieterh	I'd start by implementing the client and worker APIs
[07:52] pieterh	you can then test against the existing broker
[07:53] guido_g	that was the idea
[07:53] guido_g	i'm on the client atm
[07:53] pieterh	I'm keen to see what you come up with
[07:53] pieterh	mdp will also solve the problem of reliable pipelines
[07:54] pieterh	today I profile to see where the CPU time is going
[07:54] pieterh	I suspect too much envelope stuffing
[07:54] guido_g	my idea is not to write another mdp example, but a lib that can be used for apps
[07:54] pieterh	this is what the two APIs are meant to be
[07:55] guido_g	so i figured that right at least :)
[07:55] pieterh	so an actual client looks like: https://github.com/imatix/zguide/blob/chapter4-wip/examples/C/mdclient.c
[07:55] pieterh	and the worker: https://github.com/imatix/zguide/blob/chapter4-wip/examples/C/mdworker.c
[07:56] guido_g	sweet
[07:56] pieterh	yeah, I think so too... :-)
[07:56] guido_g	no really
[07:56] pieterh	especially since it's like 3-4 days' total work so far from concept to running code
[07:57] guido_g	as one who is not used to c anymore, this is very easy to read
[07:57] pieterh	well, it's not about the language, just the semantics you implement
[07:57] pieterh	the class-based C style we developed at iMatix is pleasant to work with
[07:58] pieterh	i'm curious to see how that maps into Python
[07:58] guido_g	sure, but you need to know a language quite well to come up w/ a good api for something
[07:58] pieterh	indeed
[08:13] yrashk	I forgot this again -- in the latest 0mq can you or can you not use sockets in diff threads?
[08:13] yrashk	this was getting confusing at some point
[08:13] guido_g	you can, but you should do it very carefully
[08:13] guido_g	as always when using threads
[08:13] yrashk	what constitutes carefulness in this case?
[08:14] sustrik	no parallel access
[08:14] yrashk	so if mutexed around, we're fine?
[08:15] eyecue	moin
[08:15] guido_g	which means the same usage-pattern as before, but you can migrate a socket from one thread to another
[08:15] guido_g	yrashk: no
[08:15] guido_g	yrashk: doesn't make sense
[08:15] sustrik	guido_g: it would work
[08:16] sustrik	basically it means you migrate the socket each time the mutex is locked
[08:16] eyecue	pieterh; nice coincidence, a vendor we're engaging today uses zeromq to arbitrate their mail and messaging infrastructure :)
[08:16] sustrik	yrashk: it's going to be slow
[08:16] yrashk	sustrik: yeah
[08:16] yrashk	mutexes add a penalty
[08:16] sustrik	yes
[08:16] guido_g	sustrik: technically yes, but you should be the one pointing out that it is broken by design and start to protect the innocent developers...
[08:16] pieterh	eyecue: nice
[08:16] eyecue	pieterh; taguchimail :]
[08:16] sustrik	exactly
[08:16] eyecue	in case youre interested
[08:17] sustrik	that's why we added scary stuff about memory barriers into docs :)
[08:17] guido_g	obviously didn't work
[08:17] guido_g	]:->
[08:17] eyecue	i might need to pick some irc brains soon for how i can apply 0mq to our stuff
[08:17] pieterh	sustrik: problem is that 'memory barriers' isn't very explanatory
[08:17] guido_g	i'm available for hire ,)
[08:17] sustrik	that's the point
[08:18] sustrik	for most people it translates to "don't do this"
[08:18] pieterh	sustrik: doesn't help to be mysterious
[08:18] sustrik	which is exactly the message
[08:18] pieterh	passing sockets from thread to thread is a valid pattern
[08:18] pieterh	bleh
[08:18] eyecue	pieterh; would this sentence make sense: i would get a developer to build a daemon'able queueing application using the 0mq library and add support for some form of persistent storage?
[08:19] pieterh	eyecue: depends whom you're talking to
[08:19] sustrik	pieterh: we can remove the memory barrier text
[08:19] pieterh	sustrik: the explanation needs to be precise
[08:19] sustrik	but actually, you do have to do the memory barrier
[08:19] pieterh	you can create a socket in one thread and pass that to another thread
[08:19] eyecue	pieterh; in the strictest sense that 0mq is a programming 'library' which one can use to then build daemons (instead of say a client app)
[08:19] pieterh	you MUST NOT read/write/close the same socket in multiple threads
[08:19] sustrik	the thing is that in 99% of cases it's done for you
[08:20] sustrik	so you don't have to care about it
[08:20] eyecue	pieterh; compared to say your beanstalkd's, which implement and provide a daemon out of the box
[08:20] pieterh	sustrik: not one person has hit a problem related to memory barriers afaics
[08:20] yrashk	the problem is that we DON'T WANT to do this
[08:20] pieterh	eyecue: yes, 0MQ is a toolkit with which you build frameworks
[08:20] yrashk	but we have to deal with the way erlang uses schedulers
[08:20] sustrik	right, so should we remove the text?
[08:20] yrashk	each scheduler is a different OS thread
[08:20] eyecue	pieterh; ta :]
[08:20] yrashk	and we can't tell Erlang to use one specific scheduler only
[08:20] pieterh	eyecue: look at zero.mq/md for an example, that's 3-4 days from concept to running code
[08:21] pieterh	with full APIs etc, and in C
[08:21] eyecue	roger that
[08:21] pieterh	that's less time than it takes to learn a conventional messaging API
[08:21] eyecue	pieterh; i read the guide last night, and the most intriguing part was the self-healing slash self adapting node network concept
[08:21] pieterh	sustrik: any text in the docs has to reference people's real questions and needs
[08:22] eyecue	pieterh; yeh, i got one of our django guys to take a look at 0mq today
[08:22] pieterh	sustrik: if you like I'll review that man page and propose changes
[08:22] eyecue	pieterh; i have a particular problem to solve where i dont know what the load of the input system will be ahead of time, nor do i want to care necessarily about the message/payload size (email bodies * mailing list members)
[08:23] sustrik	sure, give it a try
[08:23] eyecue	pieterh; i want to avoid serially queueing campaigns, and i dont really care about fair queueing, so im toying with the concept of an abstract fifo queue per campaign. the issue that i have is the idea that any campaign can 'start' or be scheduled at any time
[08:23] pieterh	eyecue: my advice would be to ask here for someone to spend a short time training & helping your team use 0MQ
[08:23] sustrik	reference should be complete though, so the memory barrier issue should be at least mentioned
[08:23] eyecue	pieterh; we're in .au, know anyone here? :)
[08:23] sustrik	maybe in a note?
[08:24] eyecue	pieterh; id be more than happy to entertain that idea
[08:24] eyecue	reads mdp
[08:25] pieterh	sustrik: do we know anyone in .au?
[08:25] sustrik	let me see
[08:25] eyecue	do you have an active/developed advocacy network/framework/contact list going?
[08:26] sustrik	i have kind of vague feeling that we do, but not sure
[08:26] eyecue	pieterh; well, apart from taguchimail ;) i may go and pick his brain a bit (since we're already paying him)
[08:26] eyecue	haha very cute -> The Majordomo pattern has no relation to the open source mailing list software with the same name.
[08:27] pieterh	eyecue: someone pointed out the risk of confusion
[08:27] eyecue	that was me last night :]
[08:27] eyecue	too much beer? :]
[08:27] pieterh	ah!
[08:27] pieterh	hey, it's all one big blur...
[08:27] eyecue	amen, and so it should be
[08:28] eyecue	actually mdp may well solve for our problem domain
[08:28] eyecue	each worker can say how many active campaigns are available to dequeue
[08:28] eyecue	it could assign itself one of the 'service names'
[08:28] pieterh	eyecue: yes, mdp seems to cover a lot of problems
[08:28] eyecue	ensuring a 1:1
[08:29] sejo	if I would want to persist all the messages (until they are processed) what backend would you suggest (need a fast backing)
[08:29] eyecue	pieterh; i could then interleave/loadbalance the MTA injection from that point
[08:29] pieterh	sejo: it's not that simple but you might look at Tokyo Cabinet or similar
[08:29] sejo	pieterh: tokio cabinet, mongodb, couchdb etc?
[08:29] eyecue	hmm, could even spawn new workers based on workload that mdp knows about
[08:30] eyecue	id say threadpool, but i only mean in concept.
[08:30] eyecue	this may be a kicker though -> Workers are idempotent, i.e. it is safe to execute the same request more than once.
[08:30] pieterh	eyecue: that's a matter of shared database
[08:30] eyecue	we need to ensure non-duplicate delivery of mail/messages
[08:31] pieterh	sure
[08:31] eyecue	im not sure how it applies, but yeh
[08:31] guido_g	sejo: redis for speed only, if you have the memory
[08:31] sejo	guido_g: needs to be persistable and distributed :p
[08:31] sejo	(not that I'm asking a lot :p)
[08:32] guido_g	sure, go read
[08:32] pieterh	sejo: what're the performance requirements?
[08:32] guido_g	sejo: mongodb is quite fast too
[08:32] pieterh	eyecue: if you're up for sponsoring work on MDP and/or broker implementations, drop me a line
[08:32] sejo	pieterh: 100-500 reqs a second
[08:32] sejo	might even go up
[08:32] pieterh	sejo: can you wait a week or two? I'm working on a disk-based reliability pattern
[08:33] pieterh	fire and forget
[08:33] eyecue	pieterh; i was just about to ask you about 'pausing' queues, or something semantically similar. i note that some of your socket types block or drop, depending on which is chosen, so that may be a way
[08:33] pieterh	client sends to disk-based broker
[08:33] pieterh	broker then sends to workers
[08:33] pieterh	whole thing is brute-force ack'd
[08:33] sejo	cool, looks what I need, but sorry no time for wait (startup here and my money isn't unlimited)
[08:34] pieterh	sejo: where do you want your persistence, in client, or in center?
[08:34] sejo	on the broker
[08:34] pieterh	ugh
[08:34] pieterh	that's the worst choice
[08:34] sejo	worker should just execute and send results
[08:35] pieterh	so the problem here is that when you want the broker to hold the state
[08:35] sejo	well not much states just a group of key-value's
[08:35] eyecue	im out, thanks for the help pieter
[08:35] pieterh	you need extra work to speak to that state reliably
[08:35] pieterh	eyecue: ciao
[08:36] pieterh	sejo: if you place the persistence in the client API, for example
[08:36] pieterh	you can use dumb brokers
[08:37] sejo	ok, basically i need a group of brokers (that share data) which is pulled by workers, the same workers will push new tasks generated from the result to the broker. State is kept in worker, the minute it reaches the broker, only the messagedata is state
[08:37] pieterh	sure
[08:38] sejo	btw redis is master-slave
[08:39] pieterh	sustrik: random question about style
[08:39] sejo	couchdb is master:master
[08:39] pieterh	the man pages are written in the form of specifications
[08:39] pieterh	(which is excellent IMO)
[08:40] pieterh	to make this more contractual, we could use http://tools.ietf.org/html/rfc2119
[08:41] pieterh	e.g. "Applications MAY create a socket in one thread with _zmq_socket()_ and then pass it to a _newly created_ thread as part of thread initialization, for example via a structure passed as an argument to _pthread_create()_. Applications MUST NOT do stupid stuff."
[08:44] sejo	nice did you guys use couchdb already?
[08:45] sejo	looks pretty good
[08:54] pieterh	sustrik: ok, patch to zmq_socket sent
[08:54] pieterh	that should help IMO
[08:56] djc	sejo: couchdb is awesome, we use it a lot at work
[08:57] sejo	djc: Yeah going to use it also
[08:57] sejo	already started playing with it :p
[08:57] sejo	=80
[08:58] sustrik	pieterh: does that help in any way?
[08:58] sustrik	it doesn't answer yrashk's question
[08:58] pieterh	does what help?
[08:58] sustrik	the patch
[08:58] pieterh	what's yrashk's question? sorry, I missed something
[08:58] sustrik	the one that started the discussion
[08:58] sustrik	whether sockets can be used from 2 threads
[08:59] pieterh	" can you or can you not use sockets in diff threads?"
[08:59] pieterh	I think the patch makes this very clear
[08:59] sustrik	then the argument was that speaking about memory barriers is not clear
[08:59] pieterh	sure
[08:59] pieterh	the man page started by talking about contexts
[08:59] pieterh	now it says "sockets are not thread safe. period".
[08:59] sustrik	and that hasn't changed
[09:00] sustrik	?
[09:00] sustrik	+0MQ 'sockets' are _not_ thread safe. Applications MAY create a socket in one
[09:00] sustrik	+thread with _zmq_socket()_ and then pass it to a _newly created_ thread as
[09:00] sustrik	+part of thread initialization, for example via a structure passed as an
[09:00] sustrik	+argument to _pthread_create()_. Applications MUST NOT otherwise use a socket
[09:00] sustrik	+from multiple threads except after migrating a socket from one thread to
[09:00] sustrik	+another with a "full fence" memory barrier.
[09:00] pieterh	the first sentence is the most important
[09:00] sustrik	yep, that's ok
[09:00] pieterh	the second sentence provides the ONE valid use case
[09:00] pieterh	the third sentence explains it for those who care
[09:00] pieterh	and fourth, some bla blah about contexts
[09:00] sustrik	why singling out one valid use case?
[09:01] pieterh	it's the only one I know of
[09:01] pieterh	for normal apps
[09:01] sustrik	that's more of a guide stuff
[09:01] pieterh	that was already in the man page... I just trimmed it a little
[09:01] pieterh	plus it really does need to be there
[09:01] guido_g	ha! client just passed first send unittest
[09:01] sustrik	you mean the memory barrier stuff?
[09:02] sustrik	yes
[09:02] sustrik	the one use case should be moved to guide imo
[09:02] pieterh	sustrik: please, no
[09:02] pieterh	don't make life harder for users than it has to be
[09:02] pieterh	i agree that this is explanatory and not a specification
[09:02] pieterh	but this is so important for MT apps
[09:03] pieterh	you can't afford to hide it somewhere in a 1000-page book
[09:03] sustrik	i mean there are many use cases, so singling one of them out gives wrong impression
[09:03] sustrik	what about making it an example?
[09:03] pieterh	there are not many use cases!
[09:03] pieterh	sorry,
[09:03] pieterh	that's just not accurate
[09:03] pieterh	i've hit precisely one, in 40-50 examples that cover every angle
[09:04] pieterh	you may imagine use cases, that's not the same
[09:04] sustrik	garbage collecting the socket is pretty common
[09:04] pieterh	if you fix the inproc connect/bind issue, this use case disapepars
[09:04] pieterh	*disappears
[09:04] pieterh	that would be ideal IMO
[09:05] pieterh	only 5 people, globally, will ever write a socket garbage collector
[09:05] pieterh	maybe 10, ever
[09:05] sustrik	wait a sec, you are saying that this use case is necessary?
[09:05] sustrik	how so?
[09:05] pieterh	yes
[09:05] sustrik	you can pass context to the other thread
[09:05] pieterh	because there's no other way to create a working inproc socket pair
[09:05] sustrik	and create the socket there
[09:05] sustrik	no?
[09:05] pieterh	due to the bind/connect issue
[09:05] pieterh	sustrik: have you read the Guide?
[09:05] sustrik	yes
[09:06] pieterh	read it again until you understand 0MQ
[09:06] pieterh	:-)
[09:06] pieterh	heh
[09:06] sustrik	what's wrong with this:
[09:06] sustrik	c = context()
[09:06] sustrik	s = socket (c);
[09:06] sustrik	s.bind (...);
[09:06] sustrik	pthread_create (c);
[09:07] sustrik	and in the worker thread:
[09:07] sustrik	s = socket(c);
[09:07] sustrik	s.connect (...)
[09:07] pieterh	the example in the Guide has 3 stages
[09:07] sustrik	same thing, no?
[09:08] pieterh	let me double check
[09:08] pieterh	if there is a valid pattern, this use case disappears
[09:09] pieterh	ok, you're right afaics
[09:09] pieterh	bind before creating child threads
[09:09] pieterh	I need to change some stuff in the guide
[09:10] pieterh	we can indeed remove that example from the man page... hang on a sec then...
[09:12] pieterh	sustrik: patch sent, this is much cleaner
[09:12] sustrik	ok, thanks
[09:13] sustrik	pieterh: you've sent the old patch i think
[09:14] sustrik	looks the same
[09:14] pieterh	git did something... hang on
[09:15] pieterh	weird, git produces the wrong patch
[09:16] sustrik	never mind, i can do it myself
[09:16] sustrik	it's just reversing two paragraphs
[09:16] pieterh	sigh, just use this paragraph:
[09:16] sustrik	and adding a period
[09:16] sustrik	right?
[09:16] pieterh	0MQ 'sockets' are _not_ thread safe. Applications MUST NOT use a socket
[09:16] pieterh	from multiple threads except after migrating a socket from one thread to
[09:16] pieterh	another with a "full fence" memory barrier.
[09:16] pieterh	and move the context stuff below
[09:17] pieterh	i forgot a 'git add'
[09:17] pieterh	too many little bitty steps
[09:18] pieterh	BTW "full fence" is the proper jargon afaics
[09:18] sustrik	possibly
[09:18] sustrik	btw, we've changed the wording a bit
[09:19] sustrik	but haven't addressed the original issue
[09:19] sustrik	that speaking about 'fences' is mysterious
[09:19] sustrik	rather than helpful
[09:19] pieterh	the question was 'are sockets thread safe'?
[09:19] pieterh	and the answer is 'no'... how is that not answering it?
[09:19] sustrik	the docs said so even before
[09:19] pieterh	they hid that in a lot of other text
[09:20] pieterh	obviously people didn't see it clearly
[09:20] sustrik	ok
[09:20] sustrik	the period helps
[09:21] pieterh	shrug, I'm not sure what you're asking, yrashk's question was clear, and it seems clear the man page had way too much wrapping around the essential statement, "don't do it, but if you must, use full fence memory barriers"
[09:23] sustrik	well, the only real change seems to be the period; what i'm asking is: should it say something like "you can use socket from multiple threads given you synchronise the access"
[09:24] pieterh	people will start using mutexes all over the place
[09:24] sustrik	it's antipattern in most cases, but technically, the statement is sound
[09:24] sustrik	exactly
[09:24] sustrik	that was the original discussion, whether it's ok to scare people using terms like "memory barrier"
[09:25] sustrik	or rather be technically precise
[09:25] pieterh	i don't think fear is a valid tool
[09:25] pieterh	this is a contract
[09:25] pieterh	it should simply state what is allowed, and what is not
[09:25] sustrik	ok, then it should be "you can use socket from multiple threads given you synchronise the access"
[09:26] sustrik	that's more comprehensible than memory barrier stuff
[09:26] pieterh	every contract aims to force the signer to behave in some way
[09:26] pieterh	look, we don't want people to share sockets between threads, period
[09:26] pieterh	it's the cause of repeated failures
[09:27] sustrik	ok, good
[09:27] pieterh	we see one or two bizarre 0MQ crash reports a week due to this
[09:34] pieterh	I'm ripping out all explanation of socket migration from the guide, we'll use the pattern you explained, bind before starting child thread
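
The point made in the exchange above, that guarding every access to a shared socket with a mutex is technically sound but slow and usually an antipattern, can be sketched without 0MQ itself. In this minimal sketch, `FakeSocket` is a hypothetical stand-in for any handle that is not thread safe (it is not a pyzmq class), and `LockedSocket` and `run_senders` are illustrative names, not part of any 0MQ binding.

```python
import threading


class FakeSocket:
    """Hypothetical stand-in for a handle that is not thread safe."""

    def __init__(self):
        self.sent = []

    def send(self, msg):
        self.sent.append(msg)


class LockedSocket:
    """Illustrative wrapper: every call on the handle holds one lock,
    so only one thread touches the handle at a time."""

    def __init__(self, sock):
        self._sock = sock
        self._lock = threading.Lock()

    def send(self, msg):
        with self._lock:  # serialises all senders on this one handle
            self._sock.send(msg)


def run_senders(n_threads=4, per_thread=100):
    """Start n_threads senders that all share one locked handle."""
    raw = FakeSocket()
    shared = LockedSocket(raw)

    def worker(tid):
        for i in range(per_thread):
            shared.send((tid, i))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return raw.sent


if __name__ == "__main__":
    print(len(run_senders()))
```

Every sender queues up behind the same lock, which is the cost yrashk and sustrik agree on. The alternative the channel settles on, one socket per thread with the bind done before the child threads start, avoids the shared handle altogether.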
[09:49] sejo	if you have a zmq.PULL and recv() something, is it possible to know what worker sent that?
[09:49] pieterh	sejo: you have to add the information yourself to the message
[09:50] pieterh	otherwise, use a ROUTER (XREP) socket instead
[09:50] sejo	pieterh: ok good! thanks!
[09:50] sejo	I think it'll be better to put it in the message
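
A minimal sketch of the first option pieterh gives, putting the sender's identity into the message itself. The two-frame layout and the names `tag_message` and `untag_message` are assumptions made for illustration; nothing in the log fixes a format. With pyzmq, a frame list like this could presumably be passed to `send_multipart()` on the PUSH side and read back with `recv_multipart()` on the PULL side.

```python
def tag_message(worker_id, payload):
    """Build a two-frame message: [worker id, payload]."""
    return [worker_id.encode("utf-8"), payload]


def untag_message(frames):
    """Split a two-frame message back into (worker id, payload)."""
    if len(frames) != 2:
        raise ValueError("expected exactly two frames")
    worker_id, payload = frames
    return worker_id.decode("utf-8"), payload


if __name__ == "__main__":
    frames = tag_message("worker-7", b"result data")
    print(untag_message(frames))
```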
[09:52] pieterh	sustrik: so, I've removed all examples of socket migration from the guide
[09:53] pieterh	your pattern is actually much clearer and simpler than passing sockets around
[10:03] sustrik	ok, good
[10:03] sustrik	i'll update the reference accordingly
[10:58] guido_g	re with new internet :)
[11:03] pieterh	sustrik: is there any recommended way to trap Ctrl-C in a 0MQ program (C++ or C)?
[11:04] sustrik	standard C way
[11:04] sustrik	no specifics
[11:04] pieterh	ok, I'll give it a shot...
[11:16] CIA-21	zeromq2: Martin Sustrik master * r97add1e / (doc/zmq_init.txt doc/zmq_socket.txt):
[11:16] CIA-21	zeromq2: Documentation wrt thread-safety cleaned up.
[11:16] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/dLdfr0
[11:25] pieterh	sustrik: cool, signal handling works perfectly to shut down 0MQ...
[11:25] pieterh	I'll document it, it's simple and clean
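
The "standard way" sustrik refers to is usually a handler that only sets a flag, with the main loop checking that flag and doing the cleanup itself. Below is a minimal sketch of that shape in Python, assuming a polling loop; `on_sigint` and `serve` are illustrative names, and the `time.sleep()` call is a stand-in for a socket receive or poll with a timeout. It is not the code pieterh went on to document.

```python
import signal
import time

interrupted = False


def on_sigint(signum, frame):
    """Only record the request; the main loop does the actual shutdown."""
    global interrupted
    interrupted = True


def serve(poll_interval=0.01, max_loops=1000):
    """Loop until SIGINT (Ctrl-C) has been seen or max_loops is reached."""
    loops = 0
    while not interrupted and loops < max_loops:
        time.sleep(poll_interval)  # stand-in for a recv/poll with a timeout
        loops += 1
    return interrupted


if __name__ == "__main__":
    signal.signal(signal.SIGINT, on_sigint)
    signal.raise_signal(signal.SIGINT)  # simulate Ctrl-C for the demo
    print("interrupted:", serve())
```

Once `serve()` returns, the program can close its sockets and then terminate the context in the normal order, rather than being killed mid-call.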
[11:38] sejo	hmm I created a small zmq.PULL and try to send with telnet a message to it... when debugging it it never gets to recv()
[11:38] AlexB	Hello. :) I've found an issue with having inproc sockets that are bound but without any connected end-point. With recent versions of ZeroMQ and PyZMQ, closing a context with socket like that hangs. Here's a short test-case: https://gist.github.com/852644
[11:39] CIA-21	zeromq2: Martin Sustrik master * r184bdb8 / src/xrep.cpp :
[11:39] CIA-21	zeromq2: Bug caused by interaction of REQ/REP routing and HWM fixed.
[11:39] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/ec5ykN
[11:39] guido_g	sejo: ømq does have its own wire-format
[11:40] sejo	guido_g: ach ok, so i'd better write my client then :p
[11:40] sejo	thanks!
[11:40] pieterh	AlexB: this is a known issue
[11:40] pieterh	you need to close the socket before closing the context
[11:46] AlexB	I see. I found it in the docs now.
[11:46] AlexB	Thanks. :)
[11:50] pieterh	sustrik: can 184bdb8 go to 2.0.x?
[11:52] sustrik	yes, if applicable
[11:53] pieterh	:-) do you really trust me to figure that out and make the change?
[11:53] pieterh	the code in xrep.cpp looks similar enough
[11:54] pieterh	but I'm not comfortable backporting patches that I did not write myself
[11:54] private_meta	pieterh: hmm... renaming the .c files to .class somewhat broke online viewing on github :/
[11:54] sustrik	this is why i said maintaining a stable branch needs a dedicated person :)
[11:55] pieterh	sustrik: the process can work fine if we are consistent about patch flow
[11:55] pieterh	i.e. anyone wants a patch to version X they send it
[11:55] pieterh	separate the work of making the release, and making the code
[11:55] pieterh	it should be two separate hats / people
[11:56] pieterh	imagine it's not you...
[11:56] pieterh	who would be best to backport OpenPGM 5 support to 2.0?
[11:57] pieterh	clearly, the person who made that work in 2.1
[11:57] pieterh	private_meta: well, you were the one trying to compile these classes
[11:57] guido_g	private_meta: i bet that's because .class files are compiled .java files
[11:57] pieterh	lol java
[11:58] pieterh	private_meta: indeed, nothing shows at all... :-/
[11:58] guido_g	so another rename...
[11:58] pieterh	ok, alternate suggestions?
[11:58] private_meta	pieterh: Well, I wouldn't mind just ".h"
[11:58] private_meta	if it were c++, I'd go with ".hpp" with those classes
[11:59] pieterh	i suppose it's consistent with zhelpers.h
[11:59] private_meta	It would be, yes
[11:59] pieterh	sigh, 'clever' software...
[12:00] private_meta	heh
[12:00] guido_g	hehe
[12:00] pieterh	sustrik: if you want that patch to go into 2.0.11 (which it should IMO), please spend 2 minutes at https://github.com/zeromq/zeromq2-0/blob/master/src/xrep.cpp
[12:00] pieterh	it's not going to work if you ask the release maintainer to do backports
[12:01] pieterh	(a) I refuse, it would be insane, and (b) it's impossible if you have a real community of developers
[12:01] pieterh	just like you ask contributors to provide you with patches, I'm asking you
[12:01] pieterh	pretty please
[12:07] sustrik	well, i don't have resources to maintain 3 branches, sorry
[12:07] sustrik	i've warned you
[12:12] pieterh	this is not about maintaining branches, martin
[12:12] pieterh	it's about how we backport changes when we have N contributors and N release versions
[12:12] pieterh	did you understand my example of OpenPGM?
[12:13] pieterh	if you want this to scale, it's the only way I can see
[12:13] pieterh	it's literally two minutes for you, you know the code perfectly, it's a 10-line patch
[12:14] pieterh	and I take care of delivering that to the user community
[12:14] pieterh	but if you comingle these two tasks, there will never be properly maintained stable releases
[12:14] pieterh	period
[12:14] pieterh	do you have another suggestion?
[12:15] sustrik	it's not a 2 minute work
[12:16] sustrik	the code has changed in the meanwhile and works somewhat differently
[12:16] sustrik	so it needs careful patching, testing etc.
[12:16] sustrik	backporting is hard work
[12:16] pieterh	this is perfectly acceptable as an answer
[12:16] pieterh	"no"
[12:18] pieterh	you have two sets of people, those able to make release packages (follow procedures perfectly) and those able to backport patches
[12:18] pieterh	the intersection of those two sets is close to zero, if not literally zero
[12:18] pieterh	that's the major issue here
[12:18] Steve-o	:O
[12:18] pieterh	in the 0MQ world, specifically
[12:19] Steve-o	backporting networking code is hard and tedious work
[12:19] jsimmons	backporting sucks everywhere
[12:19] pieterh	it gets much worse if you consider a 'release' to include multiple projects
[12:19] Steve-o	I did it once for you guys with pgm 5.0, not again :/
[12:20] jsimmons	forwardport dont go backwards :D
[12:20] Steve-o	I only maintain critical fixes to old branches now
[12:20] pieterh	well, the way I see it, someone wants fix X on version Y
[12:20] pieterh	customer needs it, perhaps
[12:21] pieterh	so the backport has an economic basis
[12:21] Steve-o	which is what support contracts are about
[12:21] pieterh	yes
[12:22] pieterh	my goal with 2.0 and 2.1 is to accept and manage backports provided by people who want them to be in a specific released version
[12:22] pieterh	so, sustrik, my question was really "do you want 184bdb8 to go into 2.0.x?"
[12:24] Guthur	surely if a customer wants a new feature you say 'upgrade'
[12:27] Steve-o	Looks like mikko not about today?
[12:30] private_meta	pieterh: A question about zmsg_recv in the zmsg.[h|class] file. You exit the entire application if a message isn't received, is that intentional?
[12:30] pieterh	private_meta: nope, I've changed that, just now
[12:30] private_meta	oh
[12:30] pieterh	it should return NULL
[12:30] private_meta	ok
[12:30] private_meta	I'll think of something equivalent in C++
[12:37] sustrik	pieterh: it would be nice if someone backported it
[12:38] pieterh	sustrik: that's what I felt but someone has to want to, enough to do it, and then convince the maintainer (me) to accept the patch
[12:38] pieterh	it's the same workflow for all projects, right?
[12:38] pieterh	I think there are compromises possible here...
[12:38] sustrik	yes, i think so
[12:38] pieterh	for example, I'm happy to test a release heavily before it goes out
[12:39] pieterh	but that demands proper test cases for every (new) patch
[12:39] sustrik	right, lot of work
[12:39] pieterh	I'm happy to coordinate with mikko to ensure the rc builds
[12:39] pieterh	divide the work, then it's not so much
[12:39] pieterh	the only thing I totally refuse to do is patch code I did not write
[12:40] pieterh	not in stable production releases
[12:40] sustrik	so no backports?
[12:40] pieterh	again: divide the work
[12:40] sustrik	then i would suggest dropping the stable repos and using tags on master instead
[12:40] pieterh	author of code can confidently do a backport if there is incentive
[12:40] pieterh	sigh
[12:40] sustrik	ah, yes, sure
[12:41] pieterh	if you insist on mixing the two roles, we're going to get badly stuck
[12:41] pieterh	that's MHO
[12:41] pieterh	s/you/we/
[12:42] pieterh	look, you provide me with a patch to 2.1, it applies cleanly, ok
[12:42] pieterh	I'm still going to test the thing before it's released
[12:42] pieterh	but you can't assume I know the code well enough to do that manualluy
[12:42] pieterh	*manually
[12:42] pieterh	so there must also be proper test cases for new patches
[12:43] pieterh	all of this makes it possible to scale
[12:43] pieterh	if you depend on unique people able to do everything, you cannot scale
[12:43] pieterh	s/you/we/
[12:43] sustrik	ok, let me explain my pov
[12:44] pieterh	shoot
[12:44] sustrik	it's a problem of cost
[12:44] sustrik	the process we've had before was the minimal-cost process
[12:44] pieterh	in what sense was is cheap before?
[12:44] pieterh	*it
[12:44] sustrik	no backporting
[12:44] sustrik	there was maint
[12:45] pieterh	well, there should have been backporting of critical fixes
[12:45] sustrik	the patches to maint were upstreamed to master
[12:45] sustrik	but not other way round
[12:45] pieterh	... ok, and how often did that happen?
[12:45] pieterh	last fix to maint was 1 december afaict
[12:45] sustrik	rarely
[12:45] sustrik	that's why the cost was low
[12:45] pieterh	the process was cheap because the process was never used
[12:45] sustrik	so the user was faced with following choice:
[12:45] pieterh	hey, I can do that now as well
[12:46] pieterh	and we can upstream today just as easily
[12:46] pieterh	it's no different
[12:46] sustrik	1. stay with maint, patch the problem there, submit the patch
[12:46] sustrik	2. move to master
[12:46] pieterh	1. stay with 2.0, patch the problem there, submit the patch
[12:46] pieterh	2. move to 2.1, patch the problem there, submit the patch
[12:46] pieterh	3. move to 2.2 master, etc...
[12:46] sustrik	yeah
[12:46] pieterh	it is exactly the same process
[12:47] sustrik	nope, it's unidirectional
[12:47] pieterh	ok, theoretical question
[12:47] pieterh	if someone provides a patch to 2.0, how does that get upstreamed?
[12:47] pieterh	who does that work?
[12:47] sustrik	myself
[12:48] sustrik	i'm maintainer of master
[12:48] sustrik	so i want it to be bug-free
[12:48] pieterh	but you're not following the old releases
[12:48] pieterh	how do you even know there was a change?
[12:48] sustrik	so if someone finds a bug in maint and submits a patch
[12:48] sustrik	i want it to get to master
[12:48] sustrik	all the patches flowed via myself
[12:48] pieterh	ah
[12:48] pieterh	so you also maintain maint?
[12:49] pieterh	and if there are three branches, you maintain all 3?
[12:49] pieterh	afaics we agree on the problem and the solution
[12:50] sustrik	'maintain' in terms of applying patches that i get from other people
[12:50] pieterh	yes, but you don't want to get patches to old versions, do you
[12:50] pieterh	because that makes you maintainer of those
[12:50] sustrik	git apply
[12:50] sustrik	git commit
[12:50] sustrik	git push
[12:50] sustrik	easy
[12:51] pieterh	that assumes the patch is tested and perfect
[12:51] pieterh	ok, let's assume that
[12:51] sustrik	yes, that's up to submitter
[12:51] pieterh	so we do the same today, except you don't have to do that
[12:51] pieterh	the patch must be tested and perfect
[12:52] pieterh	then anyone who follows the process can apply/commit/push
[12:52] pieterh	does not require your time
[12:52] pieterh	that then becomes scalable
[12:52] sustrik	it's up to you what process you use for stable branches
[12:53] pieterh	and if the patch is not perfect, you don't want to be fixing up old branches, do you
[12:53] pieterh	retesting 2.0?
[12:53] pieterh	so what we've done with the separate gits, and why we both like it
[12:54] pieterh	is that you have delegated the grunt work of applying perfect patches to old versions
[12:54] pieterh	followed by documentation, release packaging, uploading, emails, etc. etc.
[12:54] pieterh	ack?
[12:54] pieterh	second aspect is upstream vs. downstream
[12:55] pieterh	totally orthogonal, this applies equally if we had 1 repo or 3 repos
[12:55] sustrik	yeah, i've delegated that work to you
[12:55] sustrik	now i don't care about it anymore
[12:55] pieterh	yes, and that is a big improvement
[12:55] sustrik	life is beautiful now :)
[12:55] pieterh	well, that is what I call a saving in time and cost
[12:55] pieterh	so now we look at the orthogonal issue of up/down streaming
[12:56] pieterh	as maintainer, I'll accept perfect patches from anyone I trust
[12:57] pieterh	I don't necessarily see 2.0 as downstream from 2.1, they act as two independent projects, almost
[12:57] sustrik	sure, it's up to you
[12:58] pieterh	do you understand why I want to separate the role of coding from release management?
[12:58] ianbarber	in that process there must be an agent who wants a patch from <new version> in <maintenance version> and is willing to do the work to test and submit the perfect patch. But, the people most likely to be trusted by the maintainer are also most likely to be working on the later/latest versions, I would guess.
[12:59] ianbarber	and the people that are on the older versions don't necessarily have the awareness that there are improvements that would benefit them
[12:59] pieterh	ianbarber: to a point, but if we are smarter, it scales better
[12:59] pieterh	for example, better test cases
[12:59] ianbarber	yeah
[12:59] sustrik	it's hard to get someone to do that kind of work
[12:59] sustrik	linux has GregKH
[12:59] pieterh	sustrik: there has to be an economic incentive in every case
[12:59] ianbarber	better test cases would let you apply patches more speculatively
[13:00] ianbarber	if you could know things will shout if you break stuff
[13:00] sustrik	but i would guess people like that are pretty rare
[13:00] pieterh	ianbarber: indeed, I'd rather trust a test case I can understand than someone's patch
[13:00] ianbarber	definitely.
[13:00] pieterh	sustrik: not only rare, but counter-productive in the end
[13:00] pieterh	if you know too much, you cut corners
[13:01] pieterh	being naive forces better process in many cases
[13:01] pieterh	e.g. I don't know the internals so I insist on better test cases
[13:01] sustrik	having a dedicated maintainer of stable is counter-productive?
[13:01] pieterh	if I know the code really well, I won't use test cases
[13:02] pieterh	and the product and process will suffer (this has already happened a lot)
[13:02] pieterh	sustrik: you say "maintainer" to mean two different things in my view
[13:02] pieterh	I am the dedicated maintainer but not in the sense of "I own the code"
[13:02] pieterh	I own the release and the process
[13:03] pieterh	the code is owned by its authors
[13:03] sustrik	maintainer in terms of "i am willing to spend all my free time to maintain stable"
[13:03] pieterh	define "maintain" please
[13:03] sustrik	that's pretty rare i would say :)
[13:04] pieterh	look, I maintain the Guide, yes?
[13:04] pieterh	that is, I'm responsible for the repository, publishing it, and I write the text
[13:04] pieterh	but every set of translations has their authors
[13:05] pieterh	they own that code
[13:05] pieterh	I don't maintain it
[13:05] sustrik	this discussion is pointless
[13:05] sustrik	the process in stable is entirely up to you
[13:05] pieterh	if you want that patch to go into 2.0, send it my way, thanks
[13:05] Guthur	pieterh: Do you think it would be nice to have some readily available statistic of the guide example coverage for each language binding?
[13:06] pieterh	Guthur: hang on...
[13:06] Guthur	it's ok i'm not going anywhere soon
[13:06] Guthur	hehe, and I know you guys are having a heated debate
[13:06] pieterh	Translations (48 in total):
[13:06] pieterh	Ada 0, 0%
[13:06] pieterh	Basic 0, 0%
[13:06] pieterh	C 48, 100%
[13:06] pieterh	C++ 33, 68%
[13:06] pieterh	C# 26, 54%
[13:06] pieterh	Common Lisp 32, 66%
[13:06] pieterh	Erlang 0, 0%
[13:06] pieterh	Go 2, 4%
[13:06] pieterh	Haskell 0, 0%
[13:06] pieterh	Java 17, 35%
[13:06] pieterh	Lua 0, 0%
[13:06] pieterh	Objective-C 0, 0%
[13:06] pieterh	ooc 0, 0%
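The percentages in the list above look like whole-number division against the 48 examples, rounded down (33 of 48 gives 68, not 69). A minimal sketch of that arithmetic, assuming truncation is what the Guide's tooling does; `coverage_percent` is a hypothetical name, not part of the Guide's scripts:

```python
def coverage_percent(translated: int, total: int) -> int:
    """Whole-number percentage of translated examples, rounded down."""
    return translated * 100 // total


# Counts copied from the list above; 48 is the total number of examples.
counts = {"C": 48, "C++": 33, "C#": 26, "Common Lisp": 32, "Go": 2, "Java": 17}
for language, translated in counts.items():
    print(f"{language} {translated}, {coverage_percent(translated, 48)}%")
```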
[13:06] Guthur	oh whoa
[13:07] Guthur	I wasn't expecting there to be any
[13:07] Guthur	Would that be worth making visible, do you think?
[13:08] pieterh	where?
[13:08] Guthur	top of the guide maybe
[13:08] pieterh	hang on...
[13:08] pieterh	I'll post it next time I do an update, if I remember
[13:08] Guthur	cool
[13:09] Guthur	I need to bump that C# number up a bit, hehe
[13:21] private_meta	pieterh: about zmsg, is it made to cater other types but standard character strings?
[13:21] pieterh	nope
[13:21] pieterh	this is not meant to be production code, it's for the examples
[13:21] private_meta	pieterh: in the dump function you differ between non-character blob and string, that's why I ask
[13:22] pieterh	yes, because a socket can get anything
[13:22] pieterh	the class could be expanded to allow binary message parts
[13:22] pieterh	but that gets complex quickly, and was not worth it for here
[13:22] pieterh	I'd do it in zfl_msg, which is a more serious implementation
[13:23] private_meta	hmm ok. If it were string only, i would use std::string and the implementation would be a little more clean
[13:23] private_meta	but the example should follow the same functionality as the C version i guess
[13:24] pieterh	you can implement it any way you like, I'd suggest keeping the API similar so people can follow
[13:25] pieterh	the methods that muck with envelopes... wrap/unwrap/push/ etc.
[14:38] guido_g	pieterh: ping?
[14:38] pieterh	hi guido_g
[14:38] guido_g	question about mdp spec
[14:39] pieterh	shoot
[14:39] guido_g	worker send ready message to broker
[14:39] guido_g	there will be no response to that, right?
[14:39] pieterh	there is a response but it's not immediate
[14:39] pieterh	it's the lru pattern
[14:40] guido_g	i'm talking about the ready message
[14:40] pieterh	worker waits, indefinitely, for a client request to arrive, that's the response
[14:40] guido_g	from the worker
[14:40] guido_g	to the broker
[14:40] guido_g	aka handshake
[14:40] guido_g	ok, half of a handshake
[14:40] pieterh	yes, I do understand what you are asking
[14:40] pieterh	it is the lru pattern
[14:41] guido_g	i can't find any answer to the ready message is the spec
[14:41] guido_g	*in the spec
[14:41] pieterh	there is no answer to the ready message, as such
[14:41] pieterh	the first REQUEST is the answer
[14:41] guido_g	ahh we're getting closer
[14:41] pieterh	a DISCONNECT would be a negative acknowledgement
[14:41] pieterh	sorry, when I say "lru pattern" does that make any sense?
[14:42] guido_g	ok, so till a request arrives the worker does not know if the broker has his service registered correctly, right?
[14:42] guido_g	no it doesn't, because we're on mdp protocol level
[14:42] pieterh	it can safely assume all is OK unless it (a) gets a disconnect or (b) no heartbeat within whatever time
[14:43] pieterh	ok, good point, I'll add this explicitly
[14:43] guido_g	so a received hb from the broker will count as "registration is fine"
[14:44] guido_g	so we need a basic timeout value for the handshake
[14:44] guido_g	or a dedicated "got your ready" reply
[14:45] pieterh	what is the value of a dedicated "OK" command?
[14:46] pieterh	I'm not against it, but it seems unnecessary
[14:46] guido_g	the value would be that the worker from this point on knows that his service is registered and he does receive heartbeats etc.
[14:47] pieterh	what is the value of the worker knowing?
[14:47] guido_g	would allow for small timeouts in the handshake phase
[14:47] pieterh	I'm being serious, remember we're on disconnected TCP
[14:48] pieterh	so the broker can arrive 30 minutes late
[14:48] pieterh	do we want to break that? I'd rather not, it's valuable
[14:48] guido_g	wouldn't break that, just needs a longer timeout
[14:48] pieterh	which is the heartbeat
[14:49] guido_g	right
[14:49] pieterh	so you don't need a positive ack
[14:49] guido_g	but not the "i'm ok" hb, but the one for the connection handshake
[14:49] pieterh	a negative ack gives you just the same semantics
[14:50] guido_g	example: no broker , worker starts, broker comes 30 seconds later
[14:50] guido_g	general hb interval is 1 sec
[14:50] guido_g	client died due to lots of missed hbs
[14:50] pieterh	worker will reconnect over and over and eventually get a HB back
[14:50] pieterh	clients have different timescales
[14:51] pieterh	the client could easily retry like the worker does
[14:51] pieterh	that's not MDP's problem
[14:51] guido_g	shit, s/client/worker/
[14:51] pieterh	worker does not die, if you look at my API
[14:51] guido_g	again
[14:52] pieterh	it keeps reconnecting forever (or until Ctrl-C in my latest code)
[14:52] guido_g	i do look at the spec
[14:52] pieterh	right
[14:52] pieterh	"If the worker detects that the broker has disconnected, it MUST restart a new conversation."
[14:52] pieterh	I'm not sure that's accurate
[14:52] guido_g	ok
[14:53] pieterh	but it allows the broker to kick workers that come in without doing 'ready'
[14:53] guido_g	would mean: after x missed hbs close socket and start from the beginning
[14:53] pieterh	yes
[14:53] pieterh	"Both broker and worker MUST send heartbeats at regular and agreed-upon intervals. A peer can consider the other peer "disconnected" if no heartbeat arrives within some multiple of that interval (usually 3-5)."
[14:53] guido_g	ok, i'll implement it that way
[14:53] pieterh	s/can/SHOULD/
[14:54] guido_g	MUST even
[14:54] guido_g	otherwise the whole thing would be useless
[14:54] guido_g	i'll start the hb timer after sending the ready message then
[14:55] pieterh	ack, MUST it is
[14:55] guido_g	fine
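The handshake agreed above has no positive acknowledgement: after sending READY the worker assumes it is registered unless it receives a DISCONNECT or the heartbeat deadline expires, and either of those means starting a new conversation. A rough sketch of that rule as a state function follows. The state names and the event strings ("HEARTBEAT", "REQUEST", "DISCONNECT", "EXPIRED") are hypothetical labels chosen for illustration, not identifiers from the spec or from pieterh's API.

```python
from enum import Enum


class WorkerState(Enum):
    READY_SENT = "ready_sent"   # READY sent, nothing heard from the broker yet
    REGISTERED = "registered"   # broker has shown signs of life
    RECONNECT = "reconnect"     # close the socket and start a new conversation


def on_broker_event(state: WorkerState, event: str) -> WorkerState:
    """Return the worker's next state for one (hypothetical) event label."""
    if state is WorkerState.RECONNECT:
        return state  # this conversation is over; the caller starts afresh
    if event in ("DISCONNECT", "EXPIRED"):
        return WorkerState.RECONNECT  # negative ack, or too many missed heartbeats
    if event in ("HEARTBEAT", "REQUEST"):
        return WorkerState.REGISTERED  # either one counts as "registration is fine"
    return state  # unknown events leave the state unchanged
```

The caller would open a fresh socket, send READY again, and reset to `READY_SENT` whenever `RECONNECT` is returned.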
[14:55] pieterh	I need to reboot this box, it's starting to act funny
[14:55] pieterh	cy in a minute or two
[14:55] guido_g	np
[14:59] pieter_hintjens	back
[14:59] pieter_hintjens	what did I miss? :-)
[15:00] cremes	is your new laptop with the ssd giving you fits?
[15:01] pieter_hintjens	cremes: it all works... :-)
[15:01] pieter_hintjens	it's silent and uses a little less power, and it's faster
[15:01] guido_g	pieter_hintjens: why are you talking about missing x hbs? why not simply set the hb interval and barf if one is missed?
[15:02] pieter_hintjens	guido_g: bitter experience
[15:02] guido_g	uh
[15:02] guido_g	could you explain that a bit?
[15:02] pieter_hintjens	it's relatively easy to miss single HBs due to the two peers each sending out at intervals
[15:03] pieter_hintjens	it's also relatively easy to get them delayed when there's congestion
[15:03] pieter_hintjens	we did extensive testing of this for AMQP
[15:04] pieter_hintjens	you basically really do not want false positives
[15:04] guido_g	ah its ok
[15:04] pieter_hintjens	try it yourself, set the liveness to 1, run it in real use
[15:07] guido_g	so basically the hb interval is the max. time w/o a message divided by the liveness as you call it
[15:07] pieter_hintjens	yes
[15:07] pieter_hintjens	this is a simpleminded design, I'm sure we can do better over time
[15:08] guido_g	me too :)
[15:08] pieter_hintjens	here's one key aspect...
[15:08] pieter_hintjens	say the clock is 1 second
[15:08] pieter_hintjens	and a HB comes in at 1.001 seconds
[15:08] pieter_hintjens	and your liveness is 1
[15:08] pieter_hintjens	you now disconnect
[15:09] pieter_hintjens	so liveness must be at least 2 to allow for fractional delays in HBs
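The 1.001-second case can be checked with a few lines of arithmetic: the longest tolerated silence is the heartbeat interval multiplied by the liveness. This is only a sketch of that rule; `peer_is_alive` and its parameter names are made up for the illustration.

```python
def peer_is_alive(seconds_since_last_heartbeat: float,
                  interval: float, liveness: int) -> bool:
    """A peer counts as alive while its silence is within interval * liveness."""
    return seconds_since_last_heartbeat <= interval * liveness


# Heartbeat arrives 1 ms late on a 1-second clock.
print(peer_is_alive(1.001, interval=1.0, liveness=1))  # False: a false positive
print(peer_is_alive(1.001, interval=1.0, liveness=2))  # True: the slack absorbs the delay
```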
[15:09] guido_g	you have to draw the border somwwhere
[15:09] guido_g	*somewhere
[15:09] pieter_hintjens	what border?
[15:09] guido_g	connection alive or not
[15:09] pieter_hintjens	yes, how delayed can a HB be...
[15:09] pieter_hintjens	if it's 0.1% delayed, that's not a dead peer
[15:09] guido_g	infinite?
[15:10] pieter_hintjens	if it's infinite, you don't have heartbeating any more
[15:10] guido_g	by a femto second?
[15:10] pieter_hintjens	by a couple of heartbeats, is the best answer we found
[15:10] pieter_hintjens	it's like your own heart can miss one beat, two if you're really shocked, but three means you're dead
[15:11] guido_g	this is more a problem of how to define the border between alive and not
[15:12] pieter_hintjens	yes, and it's a heuristic you can't really make until you have real apps
[15:12] guido_g	you get the same problem when counting hbs
[15:12] pieter_hintjens	that's why it's configurable in my latest api
[15:12] guido_g	ok
[15:13] guido_g	i'll implement that as a simple counter decremented by a periodically called method
[15:13] guido_g	so one can adjust the interval and the count as he/she/it sees fit
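One way the counter described above might look, assuming some external timer calls `tick()` once per heartbeat interval. The class and method names are hypothetical and are not taken from pyzmq-mdp or the C API.

```python
class HeartbeatMonitor:
    """Counts down on every timer tick; a received heartbeat refills the counter."""

    def __init__(self, liveness: int = 3):
        self.liveness = liveness      # how many silent intervals are tolerated
        self.remaining = liveness

    def on_heartbeat(self) -> None:
        """Call when a heartbeat arrives from the peer."""
        self.remaining = self.liveness

    def tick(self) -> bool:
        """Call once per heartbeat interval; returns True while the peer counts as alive."""
        self.remaining -= 1
        return self.remaining > 0


monitor = HeartbeatMonitor(liveness=3)
print(monitor.tick())   # True, one silent interval
print(monitor.tick())   # True, two silent intervals
print(monitor.tick())   # False, three in a row: treat the peer as gone
```

Both the interval (how often `tick()` is called) and the liveness stay adjustable, which is the point of keeping them separate.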
[15:19] pieter_hintjens	kiss...
[15:19] guido_g	thanks :)
[15:19] pieter_hintjens	it just has to detect real failure
[15:19] pieter_hintjens	:-)
[15:20] guido_g	i need to read http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
[15:24] pieter_hintjens	guido_g: there may also be some threads on the websocket heartbeating discussions
[15:27] guido_g	first i'll implement the simple version
[15:32] sustrik	guido_g: tcp's keepalives are just an afterthought
[15:32] sustrik	even explicitly discouraged by the spec
[15:32] guido_g	ah ok
[15:32] sustrik	if you want to read about sane heartbeating mechanism, have a look at SCTP heartbeats
[15:32] guido_g	one tab less open in the browser then :)
[15:33] guido_g	ok, will do, but later
[15:33] sustrik	just a tip :)
[16:00] ok2	hi!
[16:02] ok2	question, how can i see with zmq, on send() that the other end is not here anymore?
[16:02] ok2	zmq simply hangs for me, if the other side crashes
[16:05] cremes	ok2: you need to use zmq_poll(), zmq_send() with ZMQ_NOBLOCK and set up your own timeout logic
[16:06] cremes	there is some work being done to wrap up this pattern (it gets asked about a lot)
[16:06] cremes	but i have no idea what the status is on that work
[16:06] cremes	i've built this myself but using ruby instead of C
[16:08] ok2	it means that i make zmq_poll(x, NOBLOCK) and wait on zmq_poll for POLLOUT?
[16:08] ok2	zmq_send(x, NONBLOCK)
[16:09] ok2	no, on POLLERR ...
[16:20] cremes	ok2: no, that isn't correct
[16:21] cremes	let's say you are using a req/rep socket pair and you want to timeout on the request
[16:21] cremes	you would do something like:
[16:22] cremes	zmq_send(req_socket, message, ZMQ_NOBLOCK)
[16:22] cremes	register req_socket with zmq_poll()
[16:23] cremes	and POLLIN (you are looking for a reply, so that will be a recv operation)
[16:24] cremes	when req_socket returns POLLIN, call zmq_recv(req_socket, replymsg, 0)
[16:24] cremes	OR
[16:24] cremes	generate an application error if your timeout has expired before req_socket returns POLLIN
[16:24] cremes	make sense?
[16:39] pieter_hintjens	ok2: it's explained here: http://zguide.zeromq.org/page:all#toc68
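The recipe cremes outlines (send, poll for input with a timeout, then either receive the reply or raise an application error) can be sketched without any 0MQ dependency. The example below deliberately swaps in plain standard-library sockets and `select.select` as a stand-in for `zmq_send()`, `zmq_poll()` and `zmq_recv()`; it shows only the control flow, and `request_with_timeout` and `RequestTimeout` are hypothetical names.

```python
import select
import socket


class RequestTimeout(Exception):
    """Application-level error: no reply arrived before the deadline."""


def request_with_timeout(sock: socket.socket, request: bytes,
                         timeout_seconds: float) -> bytes:
    sock.sendall(request)                                   # 1. send the request
    readable, _, _ = select.select([sock], [], [], timeout_seconds)  # 2. poll for input
    if readable:
        return sock.recv(4096)                              # 3a. reply is waiting: read it
    raise RequestTimeout(f"no reply within {timeout_seconds} s")     # 3b. give up


client, server = socket.socketpair()

# Silent peer: the poll times out and the caller gets an error instead of hanging.
try:
    request_with_timeout(client, b"ping", 0.1)
except RequestTimeout as exc:
    print("timed out:", exc)

# Responsive peer: the reply is queued ahead of time so one thread is enough here.
server.sendall(b"pong")
print(request_with_timeout(client, b"ping", 0.1))

client.close()
server.close()
```

With a real REQ socket there is an extra wrinkle that this stand-in does not model: after a timeout the socket is still waiting for its reply, so it usually has to be closed and reopened before retrying.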
[16:42] guido_g	pieter_hintjens: any chance that the frame order in spec:7 is wrong?
[16:42] pieter_hintjens	where exactly?
[16:42] guido_g	given that the broker is a XREP type, it needs to prepend the target identity
[16:42] pieter_hintjens	guido_g: you know, I'm so pleased to see you implementing this
[16:43] guido_g	sarcasm?
[16:43] pieter_hintjens	no
[16:43] guido_g	puh
[16:43] guido_g	:)
[16:43] pieter_hintjens	seriously, it's magic to see someone implement a spec the day after it's published
[16:43] pieter_hintjens	what command are you looking at?
[16:44] guido_g	worker <-> broker
[16:44] guido_g	broker is XREP
[16:44] pieter_hintjens	this is why all the framing says "on the wire"
[16:45] guido_g	pardon?
[16:45] pieter_hintjens	when you send a message to a router socket you always prepend the address
[16:45] pieter_hintjens	but that does not go on the wire
[16:45] pieter_hintjens	it's not sent, thus is not relevant to the protocol
[16:45] pieter_hintjens	however I'll explain this, it's confusing
[16:46] guido_g	it is
[16:53] pieter_hintjens	guido_g: ok, I've added a note, can you see if it's clear?
[16:53] pieter_hintjens	http://rfc.zeromq.org/spec:7#toc11
[16:55] guido_g	GREAT!
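The distinction between what the application sends to a router socket and what goes "on the wire" can be shown with plain lists of frames. This is a sketch of the idea only: the two helper functions are hypothetical, and the frame contents are placeholders rather than the actual MDP framing.

```python
from typing import List


def frames_for_router_send(peer_identity: bytes, wire_frames: List[bytes]) -> List[bytes]:
    """What the application hands to a router socket: address first, then the protocol frames."""
    return [peer_identity] + list(wire_frames)


def frames_on_the_wire(sent_frames: List[bytes]) -> List[bytes]:
    """The router socket uses the leading address to pick the peer and transmits the rest."""
    return list(sent_frames[1:])


# Placeholder frames, not the real protocol layout.
protocol_frames = [b"", b"<protocol header>", b"<command>"]
sent = frames_for_router_send(b"worker-1", protocol_frames)

print(len(sent))                                    # 4: one address frame plus three protocol frames
print(frames_on_the_wire(sent) == protocol_frames)  # True: the address never reaches the peer
```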
[17:01] guido_g	getting close now
[17:01] guido_g	*closer
[17:03] ok2	cremes and pieter_hintjens: thank you very much, i think i got what i need :-)
[17:03] pieter_hintjens	ok2, np
[17:05] guido_g	first success for the worker
[17:09] pieter_hintjens	broker is tricky to get right
[17:10] pieter_hintjens	am still finding undispatched messages in some cases
[17:11] guido_g	so, short break and then testing my worker against your broker
[17:12] pieter_hintjens	I'll need to commit, almost ready
[17:12] guido_g	ok, no hurry
[17:24] Altxp	Hello there.
[17:36] pieter_hintjens	guido_g: ok, done
[17:36] pieter_hintjens	the broker is, afaict, complete and robust
[17:40] pieter_hintjens	I've pushed this to the website, and committed all the source code
[17:41] djc	okay, I have some questions with my distro hat on
[17:41] djc	are there options to rely on external openpgm instead of the bundled one?
[17:41] djc	what versions of openpgm are compatible with zeromq-2.1.1?
[17:42] djc	does zeromq actually depend on the xmlParser, or can that be disabled?
[17:45] guido_g	pieter_hintjens: thanks
[17:45] guido_g	after some phineas and ferb i'll check it out
[17:49] private_meta	pieter_hintjens: so it's basically finished?
[17:52] guido_g	pieter_hintjens: simple hb works
[17:54] guido_g	ahh my worker crashes when the broker is restarted
[18:08] guido_g	ha! works now
[18:52] guido_g	pieter_hintjens: python client and worker are running against your broker
[19:21] yrashk	according to our latest fixes to erlzmq tests, ezmq is still way faster
[20:07] pieter_hintjens	guido_g: I'm back... that's great!
[20:07] pieter_hintjens	private_meta: yes, it's finished for now
[20:11] guido_g	pieter_hintjens: wb
[20:11] pieter_hintjens	guido_g: I added you as contributor to the spec, if that's ok
[20:12] guido_g	small thing, the mdbroker segfaults sometimes :)
[20:12] guido_g	thanks
[20:12] pieter_hintjens	can you get me a backtrace?
[20:12] pieter_hintjens	or reproducible case
[20:12] guido_g	i'll try my best
[20:12] travlr	pieter_hintjens: hey pieter, how would suggest I implement a mail list service for an open source community project i'm ready to start?
[20:12] pieter_hintjens	ulimit -c unlimited
[20:12] guido_g	it's related to a worker being killed
[20:12] pieter_hintjens	and then you'll get core files
[20:13] pieter_hintjens	yeah, cleaning up dead workers is tricky
[20:13] pieter_hintjens	travlr: hmm... I'd avoid email lists altogether, personally
[20:13] guido_g	are your build scripts compiling w/ debug on?
[20:13] pieter_hintjens	guido_g: should be, now
[20:13] pieter_hintjens	try 'c -C' for sure
[20:13] travlr	pieter_hintjens: what might be your suggestion then.
[20:13] pieter_hintjens	travlr: you might take one of the wikis at http://irongiant.wikidot.com
[20:14] pieter_hintjens	that works well for community projects
[20:14] pieter_hintjens	forum + email alerts + wiki combinations
[20:14] travlr	is that just using wikidot's forum widgets?
[20:14] pieter_hintjens	no, pages + comments in those ones
[20:15] travlr	ah, ok so its irongiant specific... let me look at the link, thank you.
[20:15] pieter_hintjens	email lists are ok but a bit useless for dynamic work
[20:15] pieter_hintjens	thus we all hang out here on irc...
[20:16] travlr	yes, i was hoping for forum, email notification integration
[20:16] pieter_hintjens	you got it all there
[20:16] pieter_hintjens	there is an open source project template afair
[20:16] travlr	k
[20:16] travlr	thanks
[20:16] pieter_hintjens	clickety-clone and it's done, very simple
[20:16] pieter_hintjens	np
[20:25] guido_g	pieter_hintjens: https://gist.github.com/853467 is not enough info i guess
[20:26] pieter_hintjens	guido_g: when did you pull the branch from git?
[20:26] pieter_hintjens	this is a bug I fixed afaik
[20:26] guido_g	ok, i'll check
[20:26] pieter_hintjens	also it's not debug, so I think the previous version...
[20:26] pieter_hintjens	I'll have to put a version number in the broker :-)
[20:30] guido_g	it's up-to-date when i do a pull
[20:30] pieter_hintjens	hmm
[20:31] pieter_hintjens	master branch...
[20:31] pieter_hintjens	ok, let me add a version number print...
[20:31] guido_g	https://gist.github.com/853483 <- output of run
[20:31] guido_g	trying to get more info out of gdb
[20:33] pieter_hintjens	can you grab the latest master, rebuild the broker?
[20:35] guido_g	grml the c script ignores the -g
[20:35] pieter_hintjens	how are you building?
[20:35] pieter_hintjens	'./build mdbroker' should work
[20:36] guido_g	the c script does not pass the -g to the compiler
[20:36] pieter_hintjens	nope, it doesn't
[20:36] pieter_hintjens	that's not how it works
[20:36] guido_g	??
[20:36] pieter_hintjens	sigh... it's a long story
[20:36] guido_g	build calls this script
[20:36] pieter_hintjens	can't you just './build all'?
[20:36] pieter_hintjens	ah hang on
[20:36] guido_g	sigh
[20:36] guido_g	yes, i can but w/o debug info
[20:36] pieter_hintjens	yes, but it doesn't pass a -g option
[20:37] guido_g	then the header of c should be changed
[20:37] guido_g	there is a -g mentioned
[20:37] pieter_hintjens	yes, that's wrong
[20:37] guido_g	double sigh
[20:38] pieter_hintjens	what git commit are you on?
[20:38] pieter_hintjens	git log -1
[20:39] guido_g	new clone and fresh broker (w/o debug info)
[20:39] guido_g	testing...
[20:39] pieter_hintjens	what git commit are you on?
[20:40] pieter_hintjens	the c script is ok, it doesn't pass -g to the compiler as such, it processes it
[20:40] pieter_hintjens	but all this is beside the point
[20:40] guido_g	git log -1
[20:40] guido_g	commit a8121994b0c25f4d242bbf06074dc6cc389ff336
[20:40] pieter_hintjens	ok
[20:40] pieter_hintjens	take hammer, break glass
[20:40] pieter_hintjens	export BOOM_MODEL=debug; build all
[20:41] djc	pieter_hintjens: if you have a moment later, could you take a peek at my questions from 3h ago?
[20:41] pieter_hintjens	djc: let me look...
[20:42] pieter_hintjens	ok... for openpgm, I don't know
[20:42] pieter_hintjens	for the XML dependency, it's there in some random main programs still being built with 0MQ
[20:42] pieter_hintjens	they resist being killed because they theoretically form part of the 'API' and thus are holy
[20:42] djc	ah yes, that is how it goes
[20:42] pieter_hintjens	guido_g: the build script _should_ be making debug builds
[20:43] guido_g	it does not
[20:43] djc	so for openpgm I should probably stick to .92 and wait for you guys to test something newer?
[20:43] pieter_hintjens	djc: you need to ask steve-o
[20:43] djc	pieter_hintjens: okay, thanks
[20:43] djc	does he come around here, and if so, in what tz?
[20:43] pieter_hintjens	guido_g: ok, can you add 'c -C' to build just before the first if statement?
[20:44] pieter_hintjens	that'll produce a report of the actual C compiler syntax it uses
[20:44] guido_g	w/ debug info https://gist.github.com/853467
[20:44] guido_g	hehe
[20:44] guido_g	i did the export hingie and it seemed to work
[20:44] guido_g	*thingie
[20:45] pieter_hintjens	which the dratted build script is meant to do itself if necessary
[20:45] pieter_hintjens	one line of shell, how complex can it be... :-(
[20:46] pieter_hintjens	ok, looking at that backtrace, thanks
[20:46] guido_g	this happens when the (only) worker exits w/o disconnect (just disappears)
[20:47] guido_g	oops
[20:47] guido_g	tested this w/ my worker
[20:47] guido_g	will try the c worker now
[20:48] pieter_hintjens	if your worker makes it crash, send that to me
[20:48] pieter_hintjens	I can't get the crash using the C worker
[20:48] guido_g	i can :)
[20:49] guido_g	you stopped the worker w/ ctrl-c?
[20:49] pieter_hintjens	yes
[20:49] guido_g	tz tz tz
[20:49] pieter_hintjens	kill -9?
[20:49] pieter_hintjens	should not make any difference
[20:49] guido_g	it looks like the worker sends a disconnect on sigint
[20:50] pieter_hintjens	nope, don't think so...
[20:50] guido_g	ok
[20:50] pieter_hintjens	well, valgrind is also real good for this
[20:51] pieter_hintjens	valgrind --tool=memcheck mdbroker
[20:51] guido_g	crash when killing the worker w/ ctrl-c too
[20:51] pieter_hintjens	hmm, can't be a timing issue, it's all one thread
[20:51] pieter_hintjens	are you also running a client?
[20:51] guido_g	bt looks like its the same place
[20:52] guido_g	no, client has been disconnected before
[20:52] pieter_hintjens	but do you start it in the test case?
[20:52] guido_g	i've a one shot client that sends one request
[20:52] pieter_hintjens	this makes a difference
[20:52] pieter_hintjens	yes, I got the crash now
[20:53] pieter_hintjens	need to have a client request
[20:53] guido_g	even a processed one?
[20:53] pieter_hintjens	it all affects the broker state
[20:53] guido_g	obviously
[20:54] pieter_hintjens	ok, I got the error, but will look at it tomorrow
[20:54] pieter_hintjens	i appreciate the feedback enormously, guido_g
[20:54] guido_g	ømq is like a time machine, didn't use gdb for 10 years
[20:54] pieter_hintjens	heh
[20:57] guido_g	any idea where to put the python version of mdp?
[20:58] guido_g	i'll make a repo for it
[21:05] pieter_hintjens	guido_g: I'd hope for Python translations that can be used for the Guide
[21:06] pieter_hintjens	these are minimal implementations, we can make a real product & repository after
[21:06] guido_g	sure, but they're not translations of the c code, as i said
[21:06] guido_g	this is an mdp implementation using the async features of pyzmq
[21:07] pieter_hintjens	that sounds fine
[21:07] pieter_hintjens	the point isn't to write C in Python
[21:07] pieter_hintjens	the point is to show Python programmers how to make the example in question
[21:07] pieter_hintjens	so, client API, worker API, broker
[21:07] pieter_hintjens	using whatever the language offers
[21:07] guido_g	https://gist.github.com/853561 <- the test worker
[21:09] pieter_hintjens	did you read mdwrkapi.c?
[21:09] pieter_hintjens	and mdworker.c
[21:09] guido_g	no
[21:09] pieter_hintjens	probably worth doing
[21:09] guido_g	why?
[21:09] pieter_hintjens	simply to see if we can offer a consistent API
[21:10] pieter_hintjens	worker API in C or Python can look similar
[21:10] guido_g	pieter_hintjens> the point isn't to write C in Python <- ...
[21:10] pieter_hintjens	the API I made is not C
[21:10] pieter_hintjens	it's implemented in C
[21:10] pieter_hintjens	I'd thought of specifying the API as a rfc
[21:10] pieter_hintjens	both APIs
[21:11] pieter_hintjens	for example, in mdworker.c there is no 0MQ context, it's invisible
[21:12] pieter_hintjens	see https://gist.github.com/853581
[21:12] pieter_hintjens	and yes it includes a .c, please don't complain... :-)
[21:15] guido_g	https://github.com/guidog/pyzmq-mdp <- first go
[21:16] guido_g	see, the api is completely different because of the mechanisms used
[21:16] guido_g	as i said, i want to make use of the async features built into pyzmq
[21:17] pieter_hintjens	for sure
[21:18] pieter_hintjens	though I've no idea what the value is in offering such a complex API to app developers...
[21:18] guido_g	complex?
[21:18] pieter_hintjens	look, my API has three methods
[21:18] pieter_hintjens	create, send/recv, destroy
[21:19] pieter_hintjens	yours exposes the internals of MDP
[21:19] pieter_hintjens	every timeout, every message
[21:19] pieter_hintjens	it's not a wrapper at all, just a deconfobulatorix
[21:19] guido_g	the create is line 31 in the example worker
[21:19] pieter_hintjens	i made that up, btw
[21:19] pieter_hintjens	sorry :-) it's late
[21:19] guido_g	the worker method is the on_request
[21:20] guido_g	and the reply is sent by self.reply
[21:20] pieter_hintjens	i'm not complaining, this is fantastic work
[21:20] guido_g	very complicated, i've to admit
[21:20] pieter_hintjens	please do look at that gist I posted just now
[21:20] pieter_hintjens	that's what the app developer sees
[21:20] pieter_hintjens	when my C code is shorter than your Python code, to do the same work... that's... well...
[21:21] pieter_hintjens	ok
[21:21] pieter_hintjens	projects...
[21:21] pieter_hintjens	MDP is the protocol, not software
[21:21] guido_g	shorter?
[21:21] pieter_hintjens	do you see value in a project with mixed pieces, e.g. brokers / apis in different languages
[21:21] pieter_hintjens	https://gist.github.com/853581
[21:21] guido_g	yes
[21:22] pieter_hintjens	that's the echo worker
[21:22] pieter_hintjens	ok, let'
[21:22] pieter_hintjens	let's invent a name
[21:22] guido_g	i might be shocked, but can still read
[21:22] pieter_hintjens	majordomo already been taken by some random FOSS project apparently
[21:23] guido_g	random... cough
[21:23] pieter_hintjens	do you prefer memorable or meaningful?
[21:23] guido_g	the interoperability is there
[21:23] pieter_hintjens	yes
[21:23] guido_g	i'm using my worker w/ your broker and client
[21:23] pieter_hintjens	interop is a funny thing though, it operates at multiple levels
[21:23] pieter_hintjens	e.g. 0mq api is interop
[21:24] Guthur	pieter_hintjens, what's the link to this protocol?
[21:24] pieter_hintjens	zero.mq/md
[21:24] guido_g	http://rfc.zeromq.org/spec:7
[21:24] Guthur	cheers
[21:24] pieter_hintjens	:-)
[21:24] Guthur	did you find that majordomo is trademarked?
[21:25] pieter_hintjens	nope, it's not afaics
[21:25] guido_g	but still, the name is known for a quite long time
[21:25] pieter_hintjens	but it's a bad idea to mix the pattern - protocol - implementation
[21:26] pieter_hintjens	oh god, a bank email hit the list
[21:26] pieter_hintjens	you can tell them by the 1000-word disclaimers
[21:27] pieter_hintjens	And someone's acidic reply, "It took me a moment to realize your ridiculous disclaimer is not the message itself..."
[21:27] pieter_hintjens	hehe
[21:27] guido_g	hehe
[21:29] Guthur	it then has another link to even more disclaimer talk
[21:30] pieter_hintjens	I think I just sold my soul to BoA by replying to that email
[21:31] pieter_hintjens	ok, guido, let's make a project called Popcorn
[21:31] pieter_hintjens	cause I'm finishing a large bowl I made a while ago
[21:31] pieter_hintjens	there are dozens of FOSS projects called popcorn so there's no chance of confusion
[21:31] guido_g	there was a song called like that... back when music came on these large black discs...
[21:32] pieter_hintjens	8" floppies?
[21:32] guido_g	no, harder
[21:32] pieter_hintjens	12" removable winchesters?
[21:32] pieter_hintjens	did they have MP3s in those days? don't think so...
[21:32] guido_g	but drive was open mostly
[21:32] pieter_hintjens	you're making this up as you're going along!
[21:33] pieter_hintjens	so, guido_g, shall I make this as an official zeromq community project?
[21:33] guido_g	let it cook a bit more first
[21:33] pieter_hintjens	embrace the chaos!
[21:33] pieter_hintjens	it burns if you leave it cooking too long
[21:34] guido_g	if we have full interop w/ the c version, then it might be worth to be announced
[21:34] guido_g	for this it needs a broker
[21:34] pieter_hintjens	ah, but the trick with projects is to announce a stunning goal way before you can make it happen
[21:34] Guthur	what is this new project
[21:34] pieter_hintjens	guido_g: see, we have interest already...
[21:34] pieter_hintjens	Guthur: sorry, can't tell you, it's confidential
[21:34] guido_g	sigh
[21:34] pieter_hintjens	:-)
[21:35] Guthur	I could try the logs hehe
[21:35] Guthur	they are quite long though
[21:35] guido_g	https://github.com/guidog/pyzmq-mdp
[21:35] Guthur	you guys were chatting for ages, hehe
[21:35] pieter_hintjens	guido_g: cook it a while, anyhow
[21:36] pieter_hintjens	if it was my project, I'd insist on common APIs in all languages
[21:36] guido_g	at the least the broker should be there and working with your worker and client
[21:36] pieter_hintjens	yes
[21:36] pieter_hintjens	defining good, language neutral APIs is quite fun
[21:36] pieter_hintjens	and extraordinarily useful to app developers
[21:36] guido_g	pieter_hintjens: then i'd remove the repo immediately
[21:37] pieterh	i can make event driven APIs in C, that's not the issue
[21:37] pieterh	the issue is just the semantics you show to users, and whether these can be consistent between languages
[21:38] pieterh	I'm pretty adamant they can be consistent
[21:38] guido_g	there is more than language
[21:38] pieterh	guido_g: are you in London by any chance?
[21:38] guido_g	nope
[21:38] pieterh	sigh...
[21:39] guido_g	too expensive just for a beer
[21:39] pieterh	I'd actually take a train somewhere to thrash this out
[21:39] guido_g	wouldn't make it cheaper
[21:40] pieterh	ok, let's cook this a while.
[21:41] pieterh	I need to get onto the rest of Ch4 anyhow
[21:41] pieterh	there is a lot still to do
[21:41] pieterh	I'll fix that broker crash tomorrow when I find an hour, it's a busy day
[21:42] guido_g	fine with me
[21:42] pieterh	ok, gnite !
[21:43] guido_g	good night!
[21:55] guido_g	has been a long day, need some sleep
[21:55] guido_g	cya
[22:32] Guthur	pieterh, umm did the implementation of zhelpers.h not use to be in the guide text
[22:33] pieterh	Guthur: (a) it got very long and (b) the translation model only works for .c files so it was showing C code in the PHP/whatever versions
[22:33] Guthur	ok
[22:34] pieterh	Did you like having it in the guide?
[22:34] Guthur	doesn't matter too much, I just wanted to check the dump implementation
[22:35] pieterh	I think the Guide is moving towards being less focused on C
[22:35] Guthur	true
[22:35] pieterh	Though I think C is the future... but still...
[22:35] pieterh	:-)
[22:35] Guthur	I thought C was the past, hehe
[22:35] Guthur	Lisp is the future, hehe
[22:36] pieterh	Lisp is older than C afair
[22:37] Guthur	that was LISP
[22:37] Guthur	but true, Lisp's ancestor is the second oldest high level language
[22:38] Guthur	after Fortran
[22:38] Guthur	First with a Garbage Collector as well
[22:38] Guthur	afaik
[22:39] pieterh	gawk rules
[22:39] Guthur	Lisp is a very enlightening experience as they say
[22:40] Guthur	Javascript is pretty awesome as well, it's just a pity it always lives in a browser, hehe
[22:46] Guthur	pieterh, have you ever tried Forth? last language mention, I swear
[22:47] sp4ke	hi guys
[22:47] pieterh	Guthur: I used to build Forth machines in 6502 assembler, quite freaky
[22:47] pieterh	hi sp4ke
[22:47] Guthur	pieterh, cool, 6502 was one of my first programming experiences
[22:47] pieterh	The sign of a Real Programmer
[22:47] Guthur	BBC micros
[22:48] sp4ke	just want to post this link for people who want to use zmq with MSVC 2010; i made a tutorial after struggling to make it work
[22:48] sp4ke	http://www.mansysadmin.com/2011/03/using-zeromq-framework-with-visual-studio-2010-tutorial/
[22:48] sp4ke	it could be useful in the irc archive
[22:49] pieterh	sp4ke: ... it's ...
[22:49] pieterh	it comes down to "VS2010 made some changes to output properties, go to project properties -> General, then change the Output Directory field to "..\..\..\lib\" (the same path as in the Linker -> General -> Output File property)"
[22:49] pieterh	why not add that to http://zero.mq/tips
[22:49] sp4ke	yes indeed but for a newcomer to MSVC it is not obvious
[22:50] sp4ke	ok i will do
[22:50] sp4ke	did not know that link
[22:50] pieterh	lol
[22:52] Guthur	it always works without any bother for me
[22:52] Guthur	I convert and compile
[22:53] Guthur	maybe I'm just lucky, hehe
[22:53] pieterh	yeah, but he made a screenshot
[23:08] Guthur	pieterh, the identity.c is setting the identity after connecting
[23:09] Guthur	no sorry
[23:09] Guthur	my bad
[23:09] Guthur	tired eyes
[23:09] pieterh	yeah, it does it twice
[23:10] pieterh	get some sleep, man!
[23:10] Guthur	soon, just want to get this dump working right
[23:11] pieterh	:-) well, I fixed the Last Bug in mdbroker, am now crashing...