On 08/13/2018 10:20 PM, Michal Novotny wrote:
> So I got to know on the flock that fedmsg is going to be replaced?
>
> Anyway, it seems that there is an idea to create schemas for the messages
> and distribute them in packages? And those python packages need to be
> present on producer as well as consumer?
>
>> JSON schemas
>>
>> Message bodies are JSON objects, that adhere to a schema. Message schemas
>> live in their own Python package, so they can be installed on the producer
>> and on the consumer.
>
> Could we instead just send the message schemas together with the message
> content always?

I considered this early on, but it seemed to me it didn't solve all the
problems I wanted solved. Those problems are:

1. Make catching accidental schema changes as a publisher easy.

2. Make catching mis-behaving publishers on the consuming side easy.

3. Make changing the schema a painless process for publishers and
   consumers.

Doing this would solve #1, but #2 and #3 are still a problem. As a
consumer, I can validate the JSON in a message matches the JSON schema
in the same message, but what does that get me? It doesn't seem any
different (on the consumer side) than just parsing the JSON outright and
trying to access whatever deserialized object I get.

In the current proposal, consumers don't interact with the JSON at all,
but with a higher-level Python API that gives publishers flexibility
when altering their on-the-wire format.

> I would like to be able to parse any message I receive without some
> additional packages installed. If I am about to start listening to a new
> message type, I don't want to spend time to be looking up what i should
> install to make it work. It should just work. Requiring to have some
> packages with schemas installed on consumer and having to maintain them by
> the producer does not seem that great idea.
> Mainly because one of the raising requirements for fedmsg was that it
> should be made a generic messaging framework easily usable outside of
> Fedora Infrastructure. We should make it easy for anyone outside to be
> able to listen and understand our messages so that they can react to them.
> Needing to have some python packages installed (how are they going to be
> distributed PyPI + fedora ?) seems to be just an unnecessary hassle. So
> can we send a schema with each message as documentation and validation of
> the message itself?

You can parse any message you receive without anything beyond a JSON
parsing library. You can do that now and you'll be able to do that after
the move. The problem with that is the JSON format might change. The
schema alone doesn't solve the problem of changing formats, it just
clearly documents what the message used to be and what it is now.

I'd love for this to just work and I'm up for any suggestions to make it
easier, but I do think we need to make sure any solution covers the
three problems stated above.

Finally, I do not want to create a generic messaging framework. I want
something small that makes a generic messaging framework very easy to
use for Fedora infrastructure specifically. I'm happy to help develop a
generic framework (like Pika) when necessary, but I don't want to be in
the business of authoring and maintaining a generic framework.

> a) it will make our life easier
>
> b) it will allow people outside of Fedora (that e.g. also don't tend to use
> python) to consume our messages easily
>
> c) what if I am doing a ruby app, not python app, do I need then provide
> ruby schema as well as python schema? What if a consumer is a ruby app? We
> should only need to write a consumer and producer parts in different
> languages. The message schemes should not be bound to a particular
> language, otherwise we are just adding us more work when somebody wants to
> use the messaging system in another language than python.
I agree, and that's why I chose json-schema. A different language just
needs to wrap the schema in accessor functions.

An alternative (and something I wanted to propose longer term after the
ZMQ->AMQP transition) is to use something like protocol buffers rather
than JSON. The advantages there are a simplified schema format, a
general push towards a pattern of backwards compatibility (thus reducing
the need for a higher-level API), and auto-generated object wrappers in
many languages. You still potentially need to implement wrappers for
access if you change the schema in a way that isn't additive, though.

You may notice (and it's not an accident) that the recommended
implementation of a Message produces an API that is very similar to the
one produced by a Python object generated by protocol buffers. This
makes it possible to quietly change to protocol buffers without breaking
consumers, assuming they're not digging into the JSON. I'm not saying
we'll definitely do that, but it is still on the table and a transition
_should_ be easy.

The big problem is that right now the majority of messages are not
formatted in a way that makes sense. They really need to be changed to
simple, flat structures that contain the information services need and
nothing they don't. I'd like to get those fixed in a way that doesn't
require massive coordinated changes in apps.

Anyway, to summarize, I really really want this to be super easy to use
and just work. I hope we can improve it further and I'd love to hear
your thoughts. Do you think my problem statements and design goals are
reasonable? Given those, do you still feel like sending the schema along
is worthwhile?
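To make the accessor idea above a bit more concrete, here is a rough
sketch of a publisher changing its on-the-wire format without breaking
consumers. This is not the actual proposed API: the class names, the
`body_schema` attribute, and the `validate` and `load` helpers are all
made up for illustration, and `validate` is only a tiny stand-in for a
real JSON Schema validator (it checks required keys and top-level types,
nothing more).

```python
import json


class BuildMessageV1:
    """Hypothetical message class the publisher ships in its schema package."""

    # Simplified schema: only "required" and per-property "type" are checked.
    body_schema = {
        "type": "object",
        "required": ["build"],
        "properties": {"build": {"type": "object"}},
    }

    def __init__(self, body):
        self.body = body

    @property
    def package_name(self):
        # Old wire format: the name is buried in a nested structure.
        return self.body["build"]["package"]["name"]


class BuildMessageV2(BuildMessageV1):
    """Same Python API as V1, but a flat wire format."""

    body_schema = {
        "type": "object",
        "required": ["package_name"],
        "properties": {"package_name": {"type": "string"}},
    }

    @property
    def package_name(self):
        # New wire format: a simple, flat key.
        return self.body["package_name"]


# Map the schema type names used above to Python types.
_JSON_TYPES = {"object": dict, "string": str}


def validate(body, schema):
    """Stand-in for a real JSON Schema validator (assumption, see above)."""
    for key in schema.get("required", []):
        if key not in body:
            raise ValueError(f"missing required key: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key in body and not isinstance(body[key], _JSON_TYPES[spec["type"]]):
            raise ValueError(f"wrong type for key: {key}")


def load(raw, message_class):
    """Parse a raw JSON message, validate it, and wrap it in its class."""
    body = json.loads(raw)
    validate(body, message_class.body_schema)
    return message_class(body)


old = load('{"build": {"package": {"name": "kernel"}}}', BuildMessageV1)
new = load('{"package_name": "kernel"}', BuildMessageV2)

# Consumer code only touches the accessor, so it is the same for both formats.
assert old.package_name == new.package_name == "kernel"
```

A consumer that reads `package_name` never sees the format change, and a
message that doesn't match the schema fails loudly in `load` rather than
somewhere deep in the consumer. How a consumer picks the right class for
an incoming message is left out of this sketch.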
--
Jeremy Cline
XMPP: jeremy@xxxxxxxxxx
IRC:  jcline