Tuesday June 7, 2011

[Time] Name Message
[01:15] Seta00 mikko, have you seen this?
[01:15] Seta00 I think that would couple extremely well with your Lecture Recorder
[01:39] yukonbob hello, #zeromq
[02:35] Seta00 hello yukonbob
[02:49] yukonbob :)
[02:49] yukonbob doesn't seem too active in here in my visits over last few days... people here mostly knowledgeable and trouble-free?
[03:01] michelp yukonbob, the activity comes in bursts
[03:02] michelp stick around
[03:02] yukonbob am just getting started w/ 0mq -- never got into it all this time, but last couple days digging in... *very* fun.
[03:03] michelp yeah same here, i've been working with it for a couple weeks. I'll never write another server again, i'll just use 0mq ;)
[03:05] michelp have you read the guide?
[03:05] yukonbob michelp: have not, should.
[03:05] michelp there are no public services that i know about, you can easily create your own service and connect to it
[03:05] yukonbob I've just been browsing the API and examples.
[03:05] michelp the guide has numerous examples
[03:05] michelp the man pages are also very good
[03:05] yukonbob michelp: I'd like a third party service, so I'm not working around my own potential mistakes...
[03:06] michelp well i've never heard of one. i don't think you'll have much option but to run your own services
[03:06] michelp or run an existing framework like mongrel
[03:06] michelp and connect to it with your code
[03:07] yukonbob 0mq deemed suitable for transporting over internet, or is dumb idea?
[03:07] michelp there's a wiki page on that subject
[03:08] michelp don't remember exactly where it is
[03:08] michelp 0mq is not more secure than tcp, you have to build in your own security
[03:08] michelp how hard you work at that is up to you
[03:08] michelp sure
[03:08] michelp i have a little time
[04:06] yukonbob hehe: We'll generate random values, just like the real weather stations do.
[04:06] Toba weather stations = number stations?
[04:07] michelp the guide is good :)
[05:42] CIA-31 pyzmq: 03MinRK 07master * r3435e19 10/ zmq/eventloop/ : Don't exit loop on EINTR ...
[06:17] CIA-31 pyzmq: 03MinRK 07master * rcc52204 10/ (15 files in 3 dirs): rename czmq.pxd to libzmq.pxd, to avoid confusion with czmq project (high-level C API) -
[08:00] pieter_hintjens hi folks
[08:01] aurojit pieter_hintjens: hi
[08:01] eintr top of the morning, gentlemen
[08:01] pieterh hi eintr, aurojit
[08:31] eintr this would make for helluva titanic: ... makes me want to build something weird and awesome.
[08:37] pieterh eintr: looks like it'd be a great fit with 0MQ
[08:39] eintr pieterh: yeah, i think so. although overkill if all you need is logging / durability. IF you add something that would actually work on *logged* messages however, then it should be nifty.
[08:40] eintr "which of N workers processed message M within timeframe T" style questions come to mind.
[08:40] pieterh eintr: mixed with something like Titanic, it could be neat
[08:40] eintr mm, distributed brokerage :)
[08:47] mikko Seta00: haven't
[08:47] mikko Seta00: seems to be inactive
[09:59] masterzen Hi. I'm watching Zed Shaw presentation right now, and the first thing he said was: "don't put 0mq on the Internet". Is this still true? Can't we use 0mq to build remote clients/server stuff where we don't control where clients run?
[10:05] mikko let's say that it's getting better
[10:10] masterzen mikko: thanks, in what version does it get better (ie 2.1 or in the 3.0 branch)?
[10:10] mikko 2.1 is the current stable branch
[10:10] mikko 3.0 is going through a lot of changes
[10:16] pieterh masterzen: there are some places 0MQ will die if fed bad data, but that's not the real issue with Internet use
[10:17] pieterh the more profound issue is one of security, IMO
[10:21] mikko it's currently relatively easy to OOM zeromq program
[10:21] mikko if you know what you are doing
[11:00] masterzen pieterh: ok thanks, can you elaborate about the security?
[12:09] sunoano no built-in transport layer encryption ie you have to take care of that yourself eg using VPN
[12:25] private_meta ok, so compiling the same program on redhat ruins my messages, compiling/running it on ubuntu works
[12:25] private_meta I'm looking and looking and can't figure out what actually garbles my messages
[12:26] private_meta It's like my program rolls a 1d20 and everything but a natural 20 breaks the program
[12:28] mikko have you tried compiling without optimisations?
[12:28] private_meta sure
[12:28] mikko which version of g++ ?
[12:28] private_meta works on 4.4 on ubuntu, doesn't work on 4.1.2 and 4.4 on redhat
[12:29] private_meta but i'll try again with different optimizations
[12:29] mikko valgrind telling anything?
[12:31] private_meta valgrind doesn't really help there
[12:34] private_meta ah ok
[12:34] private_meta fixed something with compiling with debug mode on the server
[12:35] private_meta so now, SOME of the messages come through
[12:35] private_meta at least the same amount as with ubuntu
[12:36] private_meta I assume it still garbles the >128Byte Messages as it does with Ubuntu though... can't test it right now
[12:36] private_meta Any idea why the messages would be garbled with debugging disabled or optimization enabled?
[12:41] jsimmons because you're ruining the memory backing them and changing the compile modes changes how it's laid out in memory
[12:42] jsimmons so... valgrind!
[12:45] pieterh masterzen: sorry, went to lunch
[12:45] pieterh you still around?
[12:45] masterzen pieterh: yes
[12:45] pieterh well, any unsecured TCP activity over Internet is problematic
[12:45] pieterh it's ok for public data, read only
[12:45] masterzen pieterh: I understand
[12:45] pieterh anything else, not so much
[12:46] masterzen pieterh: it's still possible to encapsulate the data in my own encryption/security layer...
[12:46] pieterh masterzen: yes, indeed
[12:46] pieterh e.g. what Salt does over 0MQ
[12:47] private_meta JStoker: would be easier if I knew how to use valgrind that way... Tried looking for tutorials that actually helped and didn't find anything
[12:49] private_meta erm
[12:59] private_meta k, i meant jsimmons
[13:15] Seta00 masterzen, I'm currently using 0MQ on the internet, for audio streaming, with full end-to-end encryption
[13:27] masterzen Seta00: it's interesting. Did you encounter any issues?
[13:29] Seta00 masterzen, except for the gigantic lack of compatibility between different libraries, no :P
[13:29] Seta00 in other words, 0MQ gave me the less amount of issues
[13:30] Seta00 gave me less issues*
[13:30] so_solid_moo fewer issues ;)
[13:30] Seta00 yeah!
[13:30] masterzen Seta00: ok, thanks. I'll play with 0mq to give it its chance :)
[14:15] private_meta Ahahahaha... Running my program with valgrind makes it work, running it WITHOUT valgrind makes it break
[14:15] private_meta ...
[14:23] private_meta This is beyond frustrating
[15:13] pieterh private_meta: is it a multithreaded program?
[15:14] private_meta yes
[15:16] private_meta pieterh: the receiving is handled by a single thread, the sending can be done by any thread I think
[15:16] pieterh I'd suspect timing issues
[15:16] pieterh it's the only thing I know valgrind affects
[15:17] private_meta pieterh: just added and testing one more thing. If I actually DUMP the message somewhere (to string, file) before sending, it appears to work in this case
[15:18] pieterh sure, same thing
[15:18] pieterh it slows down one of the threads
[15:18] pieterh the way to fix this is to ignore the case where it works
[15:18] pieterh don't try to figure out why it works, that's unhelpful
[15:18] private_meta sure, but working cases are important to narrow down what's making it fail
[15:19] pieterh well, not really ime
[15:19] pieterh this is a classic problem solving approach
[15:19] pieterh ignore the "it works" cases and especially trying to compare the two
[15:20] pieterh what you have is a timing dependency between connects, binds, or messages
[15:20] pieterh under valgrind, or if you add debug output, the dependency shows, and causes your app to break
[15:21] pieterh so keep that breakage happening, and slice up the code until you find the culprit
[15:23] private_meta I have no idea how to do those last two lines you told me
[15:24] pieterh ok... without knowing anything about your application...
[15:24] private_meta As -) I have no idea what makes it break, -) I can't really 'slice up the code', -) I don't have half a clue about valgrind and as I stated before, there aren't any tutorials that are helpful for me, -) not quite sure what you mean with the dependencies
[15:24] pieterh you want to see your code misbehaving
[15:25] pieterh whatever you mean by "it breaks"
[15:25] private_meta yeah, call it misbehaving, works for me
[15:25] pieterh that means not running under valgrind, not doing debug output (sorry, i was getting this the wrong way around)
[15:26] pieterh now you have it breaking, you remove pieces of the code
[15:26] pieterh that's the simplest approach, chop pieces off it
[15:26] pieterh at some point you'll see it suddenly work
[15:27] pieterh you may be able to figure it out simply by studying the code
[15:27] private_meta That's my problem right there
[15:27] pieterh once you realize that the fault is caused by something happening faster than you expect, somewhere
[15:27] private_meta from the situation I have right now
[15:28] private_meta Removing code means throwing out 90% of the code required right now, and there is no intermediary version between this and the simple "let's send a message" case
[15:29] pieterh ok, then, you might try to narrow down which message flow is "going too fast"
[15:29] private_meta at least none that really works that helps me
[15:29] pieterh by inserting pauses at different places
[15:32] private_meta Before plucking it apart, there's something that bugs me. The program flow is set up so the entire connection has to be open for the program to proceed and send the message that fails. I even have to add a "sleep" after initiating the connection or it fails instantly. I just can't quite grasp there WOULD be such a timing issue
[15:34] private_meta I mean, it's the first message sent that fails. there's a second delay from connection initiation to first message send, which makes me assume it's between the function call where I say "send this string" and the actual creation of a zmsg and its sending. right now I just can't see why a timing issue can be so relevant there, as in the initial case, the entire thing is pretty "linear", even if it's in a separate thread
[15:34] private_meta well... I'll try narrowing it down, let's see if i can at least do that any further
[15:38] pieterh you know that a connect takes ~100msec to happen, in the background, right?
[15:41] private_meta pieterh: color me ignorant there for not knowing, but I thought messages sent if the connection is not established are, i don't know, kept back till it's open, and sent then
[15:41] pieterh it depends on the socket type
[15:42] pieterh also if you're connecting or binding
[15:42] pieterh what is the socket type?
[15:42] pieterh and who is doing the bind vs. connect?
[15:44] mikko i am
[15:44] mikko j/k
[15:44] mikko pieterh: how are you?
[15:44] pieterh mikko: hey, nice to see you around, I'm fine
[15:44] pieterh just got back from two weeks in the rainy west coast US
[15:45] mikko it wasn't rainy when i was there
[15:45] mikko well, it was rainy at SFO
[15:45] pieterh yeah, precisely :)
[15:45] mikko san diego was sunny
[15:45] pieterh well, that's semi-desert
[15:46] pieterh we had a good meetup in portland, and a good one in SFO
[15:50] private_meta pieterh: bind is the one that receives, it receives ok, the one that does the connect is the one that fails (it has 1 sec delay after doing the connect)
[15:51] pieterh private_meta: what socket types are you using at bind and at connect?
[15:51] pieterh and how do you define 'fails'?
[15:53] private_meta well... that's one thing I forgot to mention. it fails this way: it creates the multipart message pretty much correctly, but the "payload", the part where I actually have the content in, instead of the string it looks like a pointer address, like "7593151fa9c" (random), but as I said, only in those cases.
[15:53] private_meta about the types
[15:53] pieterh ah...
[15:53] pieterh hang on
[15:53] pieterh :-) next time, it'
[15:53] private_meta AND
[15:53] pieterh it'd be worth explaining what you mean by 'fail'
[15:53] private_meta in addition to that
[15:54] private_meta in case this is also important:
[15:54] private_meta the message to be sent is already corrupted like this BEFORE it is sent
[15:54] private_meta when it is created
[15:54] private_meta but as I said, only in the cases explained
[15:54] pieterh private_meta: hang on, we have enough data
[15:54] private_meta ok
[15:54] private_meta I really should have explained that sooner apparently
[15:55] pieterh your sender is presumably destroying the message before it's actually sent
[15:55] pieterh what language are you working in?
[15:55] private_meta C++
[15:55] pieterh ok, I've no idea what destructor or deallocator you're using, but...
[15:55] pieterh this is a classic error
[15:56] pieterh zmq_send() happens in the background
[15:56] pieterh meanwhile your main thread destroys or frees the memory the message is in
[15:56] pieterh result: 0MQ sends garbage
[15:56] pieterh you can send $0.50 via paypal to the usual address
[15:58] private_meta Following that logic, may I make an assumption?
[15:58] pieterh of course
[15:58] private_meta The larger the message to send is, the longer it takes to prepare/send, so the larger the chance for it to be erased and for the send to "fail".
[15:58] pieterh nope
[15:59] pieterh well, maybe
[15:59] pieterh sending is 1 system call, independent of message size
[15:59] pieterh if you send many small messages they may be batched into 1 system call
[15:59] private_meta The thing is, your option does not explain away the other behavior, that after a length of 128 characters, it fails CONSISTENTLY, no matter what I do
[15:59] pieterh sure
[15:59] private_meta at least according to what I found out
[15:59] pieterh fix the deallocation issue
[16:00] private_meta hmm
[16:01] private_meta I already DO a memcpy into a char pointer before giving the string to zmsg, so there shouldn't be any deallocation from anything but zmsg
[16:02] pieterh well, can you post a fragment of the sending code?
[16:02] pieterh you should be able to make a minimal test case, if it's really failing at send time
[16:03] private_meta "minimal" would still contain 10-15 files
[16:03] private_meta any kind of minimal that still resembles anything that I'm actually using
[16:10] private_meta pieterh: the kind of non-working minimal is this part 1 is the part of the code that creates the message and sends it, the rest should be in a way explanatory, but as said, it might be a bit too minimal
[16:11] pieterh so here's the problem
[16:11] pieterh char * msg = new char[size];   memcpy(msg, message.data_nb(), size);
[16:11] pieterh I mean, seriously...?
[16:11] pieterh price just went up to $1.50
[16:12] pieterh when is 'msg' destroyed? it's local to the function, no?
[16:13] private_meta ah... good you brought that up, memory leak, msg is not destroyed
[16:14] private_meta as the message can't really destroy itself after sending itself, i clear it out after it's sent, but it's not destroyed
[16:14] mikko private_meta: are you sending just ascii strings?
[16:14] private_meta mikko: pretty much, yeah
[16:15] mikko private_meta: line 5 would mess up if you had null characters in the ()
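mikko's point about null characters can be shown in isolation. The paste under discussion is not part of this log, so the sketch below only assumes that its line 5 built a string from a bare char pointer without passing a length; `payload_from_cstr` and `payload_from_buffer` are hypothetical helper names, not anything from private_meta's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds the payload without a length.
// std::string(const char*) stops at the first '\0', so any bytes after an
// embedded null are silently dropped.
std::string payload_from_cstr(const char* data) {
    assert(data != nullptr);  // a null pointer is not a valid C string
    return std::string(data);
}

// Hypothetical helper: builds the payload with an explicit size, so
// embedded nulls survive.
std::string payload_from_buffer(const char* data, std::size_t size) {
    assert(data != nullptr);
    return std::string(data, size);
}
```

For pure ASCII text the two behave the same, which fits private_meta's answer that the messages are "pretty much" ASCII strings; the difference only shows up with binary payloads.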
[16:15] pieterh ok, so new is persistent... sorry, false alarm
[16:15] mikko new allocates from heap
[16:16] private_meta well, the zmsg, in this buggy case, lives until the program dies
[16:16] pieterh private_meta: and how about 'zmq::message_t message;' in send()?
[16:16] private_meta sec
[16:16] pieterh that's not persistent
[16:16] pieterh line 46
[16:16] private_meta no, it's not
[16:16] pieterh you're sending from stack
[16:16] pieterh as soon as the function exits, the stack unrolls
[16:16] pieterh you get... hmm, stuff that looks like pointers on it
[16:17] pieterh do not send messages from stack variables
[16:18] private_meta Ok, so I new the message_t variable, but I'd still have to delete it, and to be consistent I'd still have to do it at the end of each loop iteration. What does it make different there? It still exists as long as the other
[16:19] pieterh shrug, the correct way is to allow 0MQ to deallocate the buffer
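The ownership hand-off pieterh describes can be sketched without 0MQ at all. What follows is a plain C++ stand-in, not the 0MQ API: `DeferredSender`, `flush` and `release_char_array` are made-up names, and only the callback signature is modelled on the free function that libzmq's `zmq_msg_init_data()` accepts. The idea is that the caller gives up the buffer at send time and the sending side releases it once the bytes have gone out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Callback shape modelled on the free function taken by
// zmq_msg_init_data(); everything else here is an illustrative stand-in.
typedef void (*free_fn)(void* data, void* hint);

class DeferredSender {
public:
    // Queues the buffer without copying. Ownership passes to the sender:
    // the caller must not free or modify `data` after this call.
    void send(void* data, std::size_t size, free_fn release, void* hint) {
        assert(data != nullptr && release != nullptr);
        Pending p = {data, size, release, hint};
        pending_.push_back(p);
    }

    // Stands in for the background I/O thread: "transmits" each queued
    // buffer, then hands it to the caller-supplied release function.
    std::vector<std::string> flush() {
        std::vector<std::string> wire;
        for (const Pending& p : pending_) {
            wire.push_back(std::string(static_cast<const char*>(p.data), p.size));
            p.release(p.data, p.hint);
        }
        pending_.clear();
        return wire;
    }

private:
    struct Pending {
        void* data;
        std::size_t size;
        free_fn release;
        void* hint;
    };
    std::vector<Pending> pending_;
};

// Release callback for buffers allocated with new char[].
// `hint` may point at an int counter (an assumption of this sketch).
void release_char_array(void* data, void* hint) {
    delete[] static_cast<char*>(data);
    if (hint != nullptr) ++*static_cast<int*>(hint);
}
```

With this shape, a buffer created with `new char[size]` as in the quoted fragment is freed exactly once, by the sender, after transmission. That removes both the leak private_meta mentions and any chance of freeing the buffer too early.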
[16:20] pieterh do you see that sending message_t is probably the cause of (some of your) problems?
[16:20] mikko pieterh: that construct should be fine
[16:20] mikko it's similar to creating zmq_msg_t in stack
[16:20] mikko and sending it
[16:21] pieterh mikko: it's a 0MQ message type? and properly initialized?
[16:21] pieterh hmm, ok, false alarm #2
[16:21] pieterh private_meta: I'm still going to go with "sending stuff that isn't there anymore by the time send() gets around to it"
[16:21] mikko pieterh: when the object goes out of scope zmq_msg_close is called
[16:22] mikko pieterh: but if the message is still in-flight the refcount should be >0
[16:22] pieterh private_meta: there's a magic cut-off at 50 bytes, smaller than that sits in the structure, larger is an allocation
[16:23] pieterh mikko: I'm not familiar with the C++ API, tbh, can only guess based on the symptoms
[16:24] private_meta mikko: I built it assuming that behavior, but doing that, how could I avoid behavior described by pieterh, should that be the case?
[16:24] private_meta pieterh: well, at this moment, the payload (the part of the multipart message that contains the information, has 78 bytes
[16:25] pieterh private_meta: hmm, can you try to chop this down to a test case, one sender and one receiver?
[16:25] pieterh sorry, I know that's a detour but it may be the only way to solve this
[16:25] private_meta Define one sender, one receiver... it practically IS like that right now
[16:25] private_meta I'm only testing with one sender/one receiver right now
[16:29] private_meta I've got somewhat of an onion here, basic send, zmsg around that, modified majordomo around that, project-bound client-server classes around that
[16:29] private_meta pieterh: so I'm not sure to what level you intend me trying to strip it down
[16:30] pieterh private_meta: stripped down means any lines of code that can go must go
[16:31] pieterh otherwise other people can't reasonably understand the case
[16:31] pieterh so you'd at least see if problem was in project classes, MDP classes, or zmsg class
[16:32] private_meta I'll try to strip it down to mdp only
[16:33] pieterh ack
[16:51] private_meta ok, it's more difficult than I thought
[16:52] private_meta pieterh: I already have to use threads to use the modified majordomo system as is
[16:52] pieterh ok, so instead of stripping it down...
[16:53] pieterh you have a specific case where one sender's messages are corrupted
[16:53] private_meta well, I'll still try to make the majordomo only version work, with a single thread, I'll try
[16:54] private_meta ugh
[16:54] private_meta even more difficult
[16:57] private_meta I can't strip it apart and send/receive a message normally at the same time
[17:03] michelp morning
[17:05] private_meta michelp: west coast?
[17:09] michelp private_meta, yeap
[17:10] private_meta gotta finally remember it, +9, east coast
[17:10] private_meta michelp: ah, so you do have Valve Time
[17:10] michelp valve time?
[17:11] private_meta Well, Valve Corporation, the gaming company, it's their official time zone, their new day for cheap games always starts at ~10.00 east coast time
[17:12] michelp ah i see
[17:12] private_meta sorry for destroying your mental image
[17:12] michelp nah no problem it reminds me i gotta replace my fuel injection pump :)
[17:12] private_meta ah, cars are too expensive for me
[17:13] private_meta Well, not in a "Can't afford" way, more in a "don't want to afford" way
[17:13] michelp that's why i only drive clunkers
[17:14] michelp currently rockin' an 84 diesel mazda pickup truck. cost me $1200 :)
[17:14] private_meta insurance and tax for that kind of car would be quite expensive here
[17:14] michelp where is there?
[17:14] private_meta austria
[17:14] michelp ah nice
[17:14] michelp i've been to vienna, it was pretty awesome
[17:15] michelp did a conference there at a huge hall where apparently mozart built the sets for the magic flute
[17:15] michelp iirc it was adjacent to the opera house or something like that
[17:16] private_meta tbh I haven't ingested much of the culture of vienna the couple of times I've been there
[17:16] michelp i did spend a few days in the austrian countryside too, it's amazing what a difference it is from city to country out there
[17:17] michelp i guess that's the case everywhere
[17:17] private_meta good or bad?
[17:17] private_meta or just different?
[17:17] michelp just different, i like the countryside, but the city obviously had more fun stuff going on
[17:17] private_meta sure has...
[17:17] michelp travelling on the train through the rolling hills was nice
[17:18] private_meta but I guess one of the differences of the countryside in austria compared to the US is that in Austrias countryside, you still have mobile reception ;)
[17:18] michelp yeah ;) it's getting better here, although my state is still only about 50% covered
[17:18] michelp i live in a big, less populous state so there's no incentive for them to set up towers in the vast wilderness that is most of Oregon
[17:20] private_meta It's kinda hard to swallow that almost any US state is larger in size than austria
[17:20] private_meta -any+every
[17:20] michelp well, the west coast ones at least
[17:20] michelp a lot of the eastern states are pretty small
[17:21] private_meta Everything that's larger than Maine is larger than Austria
[17:21] private_meta In other words, Austria is about 4k square kilometers larger than Maine
[17:22] michelp hmm wikipedia indicates the other way around
[17:22] michelp Maine is a little bigger, maybe i'm reading this wrong
[17:22] private_meta Maine 79,931 km², Austria 83,855 km²
[17:22] michelp maine: 91,646 km2 austria: 83,855 km2
[17:22] private_meta ah... + water is 91k
[17:23] michelp oh ok i get it
[17:23] michelp i didn't realize they made a distinction with the water :)
[17:23] private_meta Now we both do, apparently
[17:24] michelp well Oregon is well over twice that at 255K, and only 3.5M population, the vast majority of that in the one big city Portland
[17:24] michelp so the density for most of the state is extremely low
[17:25] private_meta more space for wolves and bears!
[17:25] michelp Malheur County for example is 25,610 km2 with only 31K people
[17:26] michelp never seen a wolf, yet. plenty of bears though.
[17:26] michelp the wolves are starting to repopulate, so it's only a matter of time
[17:26] private_meta Everything that manages to scare Stephen Colbert must be awesome
[17:27] private_meta It's funny how some areas are named all over the world... Malheur County makes it sound like it was an accident
[17:27] michelp yeah, doesn't it mean Bad Odor? or something like that?
[17:27] private_meta Well
[17:27] michelp Bad air?
[17:27] michelp Bad time?
[17:28] private_meta In french and german, "malheur" means "mishap"
[17:28] michelp ah
[17:28] private_meta Well, then again, Austria has a town called "Fucking"
[17:28] michelp well it's something like 90% high altitude desert so I can see how the first people who got there figured they took a wrong turn
[17:29] private_meta Somewhat hilarious, how many times the sign for that town has been stolen already
[17:30] michelp yeah, funny story about Fucking Austria, our management wanted us to translate every city name in our database. i told them it didn't work that way, only a few cities in the world have "translations" like "Nuevo York". Most cities do not, so I used Fucking Austria as my example to convince them they were wrong. ;)
[17:30] private_meta hahaha
[17:30] michelp it worked perfectly. the best technical solutions are always zero effort :)
[17:31] michelp Los Angeles is a good one too, what are we going to translate that into English? Yeah I'm flying down to "The Angels" next week...
[17:32] private_meta Well, It's weird why some towns actually have to be translated
[17:33] private_meta why can't you just say "Wien", why did it have to be translated to "Vienna"?
[17:33] private_meta although I have to admit that the early name for Vienna was Vindobona, so a V is actually closer :D
[17:35] private_meta pieterh: This is like taking a pair of dice and rolling them, with the outcome deciding if it works or not
[17:35] pieterh you mean city names, or debugging messages?
[17:35] private_meta well, both, in a way, but I meant the latter
[17:36] pieterh well, timing issues are like that
[17:36] private_meta I told you that adding a logging message made the error go away
[17:36] pieterh sure, it delays something long enough
[17:36] private_meta I did nothing else and then consistently the error went away
[17:37] pieterh sure, that also happens
[17:37] private_meta now I removed it
[17:37] private_meta now the error is still gone
[17:37] private_meta consisntently
[17:37] private_meta *consistently
[17:37] pieterh well, not consistently, obviously...
[17:37] private_meta Well, consistently meaning "no matter how many times i start it now"
[17:37] pieterh yes
[17:38] pieterh the best would be then to run it over and over in a loop
[17:38] pieterh it'll fail every N out of M times
[17:38] taotetek pieterh: hey, Ken told me he met you last week
[17:38] pieterh taotetek: hi!
[17:38] taotetek pieterh: he's who we have working on the rsyslog zeromq plugins
[17:38] private_meta pieterh: Yeah, I can ship it with a failure probability
[17:38] pieterh ah, yes, heh
[17:38] taotetek pieterh: I'm out in California this week he and I will be spending some time working on them more this thursday, woot
[17:39] private_meta cool... got a new error
[17:39] pieterh great guy
[17:39] private_meta basic_string::_S_construct NULL not valid
[17:39] taotetek pieterh: yeah he is I love working with him
[17:40] pieterh private_meta: you're working for Microsoft?
[17:40] private_meta pieterh: hey, don't let things get ugly
[17:40] pieterh then shipping with failure probability is probably not a great idea
[17:40] pieterh :)
[17:40] pieterh i meant, for debugging it
[17:41] private_meta working for a small research project
[18:13] private_meta It wouldn't be fun if there weren't a new problem whenever I try to get close to the current one...
[18:14] private_meta Now I can't Debug the server because of "Interrupted system call" exceptions
[18:14] private_meta pieterh: I get "Interrupted system call" exception whenever I use ctrl+z or whenever the debugger goes past "zmq::poll", any idea why?
[18:14] private_meta or maybe a combination thereof
[18:15] pieterh private_meta: debuggers do weird stuff with signals
[18:15] pieterh i tend to avoid them
[18:15] private_meta Well
[18:16] private_meta If I try to send the Stop Signal, like to put the application into background, the same thing happens
[18:16] private_meta terminate called after throwing an instance of 'zmq::error_t' what(): Interrupted system call
[18:17] pieterh private_meta: I've never tried Ctrl-Z on a zmq process... sounds fun
[18:18] private_meta the reason being to send it to the background
[18:18] private_meta ctrl+z, followed by a "bg" immeditely
[18:18] private_meta *immediately
[18:18] pieterh for sure, it makes sense... let me try that on a test case...
[18:20] private_meta I'll test some more too
[18:21] pieterh private_meta: well, it works fine on a C example (Ctrl-Z & bg)
[18:22] private_meta I fear it's another one of those redhat only problems
[18:23] private_meta I'll try ubuntu
[18:25] private_meta pieterh: ok... redhat enterprise linux 5 shows that problem
[18:25] private_meta using the latest zmq version
[18:26] private_meta and gcc 4.4
[18:27] private_meta Redhat is such a piece of shit, sorry for saying that
[18:28] pieterh private_meta: yeah, they are, but what's that got to do with their software?
[18:28] pieterh did you try on another Linux and it works?
[18:29] private_meta They manage to break the simplest stuff in their libraries, their repositories are ancient
[18:31] taotetek private_meta: redhat enterprise is simply older, as far as what versions of various packages you'll have.
[18:31] taotetek private_meta: have you looked at 6 by the way?
[18:31] private_meta No
[18:32] private_meta And, in RHEL 5 they manage to break loads of standard libraries with their changes they make before putting them into their repositories
[18:32] private_meta pieterh: It works on Ubuntu, I'd have to try other systems but I guess it works on them as well
[18:33] taotetek don't know on that. we have centos 5 in production and haven't had issues outside of a couple of compilation issues with things that required newer autoconf / libtool etc
[18:34] private_meta Do you know what might actually trigger that "interrupted system call" error in zmq?
[18:34] private_meta 'cause, apparently it does come from inside zmq
[18:34] pieterh IMO it's a signal sent to zmq_poll, which doesn't particularly handle it.
[18:35] pieterh presumably some versions of Linux send signals when you background a process
[18:35] pieterh like SIGHUP or something
[18:35] private_meta Well, ctrl+z DOES send a signal
[18:36] private_meta Afaik it sends SIGTSTP
[18:36] private_meta which means temporary stop
[18:39] pieterh so in CZMQ I catch various signals which means zmq_poll won't return EINTR
[18:39] pieterh actually it may still return EINTR, I forget
[18:40] private_meta Well, still... apparently sending signals 18, 20 or 24 to the process, "SIGTSTP" creates this problem, but ONLY on redhet
[18:40] private_meta *redhat
[18:42] pieterh much as I'd like to blame rhat it's probably not their fault
[18:50] private_meta pieterh: I know that is somewhat of an incomplete example, but it should show what I mean. When you press ctrl+z there, the error occurs on RHEL5
[18:52] pieterh well, zmq_poll returning EINTR is not abnormal afaik, it happens in various situations
[18:52] private_meta pieterh: In my opinion it shouldn't be normal behavior if it makes it terminate when trying to debug a program
[18:53] pieterh private_meta: it only terminates because you don't handle the EINTR
[18:53] pieterh there are a few threads on zeromq-dev about this, worth reading
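"Handling the EINTR" usually means treating an interrupted wait as a reason to wait again rather than as a failure. A minimal sketch of that idea follows, using the plain POSIX `poll()` call rather than `zmq_poll` or `zmq::poll`, so that it stands alone; `poll_retrying_on_eintr` is a hypothetical helper name. With the C++ binding the same condition surfaces as the `zmq::error_t` exception quoted earlier in the log.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>

// Hypothetical helper: waits on `fds` like poll(), but treats EINTR as
// "try again" instead of as a failure.
// Simplification: the full timeout restarts after each interruption.
int poll_retrying_on_eintr(struct pollfd* fds, nfds_t count, int timeout_ms) {
    assert(count == 0 || fds != nullptr);
    for (;;) {
        int rc = poll(fds, count, timeout_ms);
        if (rc >= 0 || errno != EINTR) {
            return rc;  // ready descriptors, timeout (0), or a real error
        }
        // A signal arrived while we were blocked. Nothing is wrong with
        // the descriptors, so simply wait again.
    }
}
```

Retrying unconditionally is only appropriate when the signal needs no reaction. A program that wants to shut down cleanly on Ctrl-C would check an interrupt flag inside the loop before waiting again.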
[18:54] private_meta pieterh: so who handles it for me on other OSes if I don't?
[18:54] pieterh private_meta: different OSes (and kernels) send different signals
[18:56] private_meta I'll check which signals are sent exactly
[19:01] private_meta pieterh: do you know what signals zmq handles?
[19:02] pieterh afaik it doesn't handle any signals, it's the poll system call (and other blocking calls) that exit with EINTR
[19:03] pieterh so, for ctrl-C, czmq sets up a signal handler that flags the interrupt in a global variable
[19:03] pieterh thus, when you get control back from zmq_poll you can test for this interrupt, and clean up properly if it hits
[19:04] pieterh otherwise, ctrl-C will simply kill the process without a chance to clean up
[19:04] pieterh now, for SIGTSTP, I'd guess this is rare enough no-one's hit the problem before
[19:05] pieterh I'd say since you have to handle SIGINT anyhow, you can set-up a similar handler for SIGTSTP
[19:06] pieterh I guess zmq_poll will still return with EINTR but at least you'll be able to handle the situation
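The flag-in-a-global approach pieterh attributes to czmq can be sketched with POSIX `sigaction()` alone. This is not czmq's actual code; `g_interrupted`, `on_interrupt`, `catch_signal` and `interrupted` are names invented for the illustration.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>

// Set by the handler, read by the main loop. volatile sig_atomic_t is the
// type that is safe to write from a signal handler.
static volatile std::sig_atomic_t g_interrupted = 0;

extern "C" void on_interrupt(int /*signum*/) {
    g_interrupted = 1;  // only record the event; no real work in a handler
}

// Installs the flag-setting handler for `signum`. Returns true on success.
bool catch_signal(int signum) {
    assert(signum > 0);
    struct sigaction action = {};
    action.sa_handler = on_interrupt;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;  // no SA_RESTART: blocking calls return EINTR
    return sigaction(signum, &action, nullptr) == 0;
}

// True once a caught signal has been delivered.
bool interrupted() {
    return g_interrupted != 0;
}
```

A main loop would call `catch_signal(SIGINT)` once at startup and test `interrupted()` each time its blocking call returns, cleaning up and exiting if the flag is set. The same call with `SIGTSTP` matches pieterh's suggestion, with the caveat that catching SIGTSTP replaces the default stop action, so Ctrl-Z would no longer suspend the process on its own.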
[19:06] pieterh this is the kind of stuff I'd expect a high-level C++ API to take care of
[19:06] private_meta I can't imagine that SIGTSTP is rare
[19:09] pieterh you may be able to just set up an ignore handler
[19:09] pieterh but I suspect poll will still exit
[19:10] private_meta I'm in contact with the #linux channel to check out what actually seems to transpire
[19:23] yukonbob if I stress socket creation/deletion, I get Too many open files
[19:23] yukonbob rc == 0 (mailbox.cpp:375)
[19:24] yukonbob are these lazily collected, and I'm just pressing its limits?
[19:24] yukonbob , or do I have a flaw?
[19:33] yukonbob ... no indication of leaked memory on my part; sockets created via zmq_socket(), destroyed via zmq_close();
[19:49] pieterh yukonbob: IMO Linux keeps sockets around for a while after your process closes them
[19:51] yukonbob so, doesn't necessarily indicate error w/ my use, nor zmq per se...
[19:51] pieterh right
[19:52] pieterh you can raise the number of files per process
[19:52] pieterh but you can still hit that by opening/closing sockets rapidly
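One way to raise the per-process file limit from inside the program, sketched with Python's standard `resource` module (Unix only). The helper name `raise_nofile_limit` is hypothetical. It only raises the soft limit, capped at the hard limit, so as pieterh notes a program that opens and closes sockets quickly enough can still run into the new ceiling.

```python
import resource


def raise_nofile_limit(wanted):
    """Raise the soft open-files limit toward `wanted`; never lower it.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot go above the hard limit.
        wanted = min(wanted, hard)
    if wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Calling `raise_nofile_limit(4096)` early in the program would be a typical use; the value 4096 is only an illustration.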
[19:52] pieterh sure
[19:53] yukonbob pieterh: cool. Thx for feedback.
[19:54] private_meta pieterh: ok, I've pretty much assured myself that the signal sent is SIGTSTP, the same signal sent when debugging a program (and stepping) or pausing a program to move it to the background, and it definitely SHOULDN'T cause this program termination
[19:54] private_meta pieterh: however, on RedHat it appears to do that
[19:54] pieterh that sucks
[19:56] private_meta And I also looked it up
[19:56] private_meta it sends the same signal on Ubuntu as on RedHat
[19:56] private_meta it's both SIGTSTP, it's both on ctrl+z or other program suspension functionality, but it just screws up rhel
[19:57] pieterh I've not used RHAT since 1997 or so, and am quite happy to not have to use it
[19:58] private_meta But imho that's still not a reason for it not to work on that operating system :/
[20:11] jond private_meta: what does info signals show in gdb
[20:17] private_meta jond: I can tell you tomorrow, after 13.5 hours today, I had to get home from work
[20:23] jond private_meta: you might can use handle command to tell gdb to do different stuff, possibly handle SIGSTP nostop noprint pass would help. I won't be around in the day (UK)
[20:23] jond s/might//g
[20:24] private_meta jond: I don't have access to the redhat server when I'm at home, at least not without LogMeIn, which I want to avoid
[20:25] jond private_meta: no problems, worth a try though
[20:25] private_meta I assume you mean SIGTSTP?
[20:25] jond yep
[20:25] private_meta what does "handle SIGTSTP nostop noprint pass" actually do?
[20:26] private_meta i mean, it seems it tells gdb to handle "SIGTSTP" in the following three ways
[20:26] private_meta but what do those count for?
[20:27] jond it basically tells the debugger not to handle it, not to print it and to just pass it on.
[20:27] private_meta so just to check if that makes the program continue as usual?
[20:28] jond yes. I've had trouble with gdb debugging and signals before, but not SIGTSTP
[20:29] private_meta Will you be online over the day?
[20:29] private_meta (I know you're not here, just asking about being online)
[20:30] private_meta just interested if a hilight with my results will be ok, or if i have to wait ;)
[20:30] jond i don't know because I start a new position tomorrow and I am not clear what security is in place or if I have a desk even yet....
[20:30] private_meta ah ok
[20:30] private_meta well, my client is online somewhere else, always on, so I'm always unsure about others ;)
[20:31] jond ok, we'll see tomorrow
[20:32] private_meta indeed we will
[20:37] cwb If I specify a pattern of message passing over 0MQ -- sending data of type A to pull socket 1 will result in data of type B = f(A) being pushed to socket 2 and so on -- would you call that an API specification or a protocol?
[20:38] cwb Are (well specified) 0MQ patterns generally protocols?
[20:42] cwb I guess what I'm asking is: when should we say that a program that uses 0MQ for messaging has an API and when should we say that it implements a protocol? What's the distinction?
[20:46] private_meta hmm... I guess many of the people who can help you with that are online during daytime (gmt/ust)
[20:46] private_meta @ cwb
[20:52] pieterh cwb: hi
[20:53] cwb pieterh: hi!
[20:54] pieterh strictly speaking, the API tells you how to speak to objects like contexts and sockets
[20:54] pieterh whereas the protocols define how these talk to each other across some transport
[20:54] pieterh there is some overlap in semantics
[20:55] pieterh e.g. 'send' in the API can map to 'send' at the transport level
[20:55] pieterh does this help?
[20:56] cwb But the protocol send might be implemented with something like "snd" or require a few other implementation specific things?
[20:56] pieterh let's say, the socket semantics might be implemented with various protocols
[20:57] cwb Yep
[20:57] pieterh e.g. if you're working over inproc, tcp, or pgm, a publisher send to subscribers will use different protocols
[20:57] pieterh 0MQ kind of pretends it has a single TCP protocol (for example) but that's a little inaccurate
[20:57] cwb Ok..
[20:58] pieterh in fact the socket semantics create variations of the TCP protocol at some level
[20:58] pieterh this is not really documented in the formal protocol spec
[20:58] cwb Yes, that makes sense.
[20:58] cwb Ok.
[20:59] cwb Actually, my question was more about things that get implemented *using* 0MQ -- i.e. applications.
[20:59] pieterh so there are two classes of application
[21:00] cwb If my Python program uses 0MQ to talk to a C program -- would you say they are talking using APIs or a protocol?
[21:00] pieterh those that use the API and those that use the protocol
[21:00] pieterh your Python and C programs are using both the API and the protocol
[21:00] pieterh there are some 0MQ stacks that don't use the same API at all
[21:01] cwb Ok.
[21:01] pieterh e.g.
[21:01] cwb There's a bash binding? That's cool.
[21:02] pieterh Daniel Lundin made this, it just publishes a single message
[21:03] pieterh its API is 'zmq_push'
[21:03] pieterh and its protocol is, same as libzmq
[21:04] cwb Aha, that helps.
[21:04] cwb Excellent, thanks!
[21:05] pieterh app <==> API ---- protocol ---- API <==> app
[21:06] cwb Perfect.
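A small Python sketch of the API/protocol split in pieterh's diagram. The functions `send_msg` and `recv_msg` play the role of the API that an application calls, while the bytes on the wire are the protocol. The framing used here (a 4-byte big-endian length followed by the payload) is an assumption made for illustration and is not the 0MQ wire format.

```python
import socket
import struct


def send_msg(sock, payload):
    """API call: send one message using the illustrative length-prefix framing."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes from the socket, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """API call: receive one message framed by send_msg (or by any compatible peer)."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)
```

A peer that never calls `send_msg` but writes the same bytes itself (for example `b"\x00\x00\x00\x02hi"`) is still understood by `recv_msg`: a different API, the same protocol, which is the situation described for the bash binding.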
[21:19] whack pieterh: haha, that 'bash' binding is quite funny
[21:20] pieterh whack: yeah, it's pretty cool
[21:21] pieterh It actually fits into twitter
[22:29] jsimmons which of the github repos am I supposed to be using
[22:58] Seta00 jsimmons, zeromq2-1 or libzmq for 0MQ 3.0