
To give you another take on Elijah's answer:

> acquiring and moving data around,

Yep, with Hamilton you can cleanly separate the bits of logic that load and update data. For example, you'd write "data loader" functions/modules that implement reading from a DB, a flat file, or some vendor API. If they all output a standardized data structure, then the rest of your workflow is coupled to that common structure (which Hamilton forces you to define) rather than to any particular implementation. That way you can be pretty surgical with changes and understand their impact.
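To sketch that pattern outside of Hamilton itself (function names and structures here are invented for illustration, not Hamilton's API): two interchangeable loaders emit the same structure, so downstream logic depends only on that structure.

```python
# Two "data loader" implementations that agree on one output structure:
# a list of {"id": int, "spend": float} records. Bodies are hardcoded
# stand-ins so the sketch stays self-contained.

def clients_from_csv(path: str) -> list[dict]:
    # Pretend this parses a flat file.
    return [{"id": 1, "spend": 120.0}, {"id": 2, "spend": 80.0}]

def clients_from_db(conn_str: str) -> list[dict]:
    # Pretend this queries a database; same output structure as above.
    return [{"id": 1, "spend": 120.0}, {"id": 2, "spend": 80.0}]

def total_spend(clients: list[dict]) -> float:
    # Coupled only to the common structure, not to how clients were loaded.
    return sum(c["spend"] for c in clients)
```

Swapping `clients_from_csv` for `clients_from_db` leaves `total_spend` (and everything downstream of it) untouched.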

Regarding assessing impacts: Hamilton can visualize and query lineage as defined by your Hamilton functions. We think Hamilton makes the "hey, what does this impact?" question really easy to answer, so that when you do need to make changes you'll have more confidence making them.
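The mechanism behind that kind of lineage query can be sketched in a few lines: parameter names declare dependencies, so "what is downstream of X?" falls out of the function signatures. This is a hedged illustration of the idea; `FUNCS` and `downstream_of` are invented names, not Hamilton's API.

```python
import inspect

def raw_data() -> list[float]:
    return [1.0, 2.0, 3.0]

def mean(raw_data: list[float]) -> float:
    # Depends on raw_data because its parameter is named "raw_data".
    return sum(raw_data) / len(raw_data)

def report(mean: float) -> str:
    return f"mean={mean:.2f}"

FUNCS = {f.__name__: f for f in (raw_data, mean, report)}

def downstream_of(name: str) -> set[str]:
    """Every function that (transitively) consumes `name`."""
    hit: set[str] = set()
    frontier = {name}
    while frontier:
        # Find not-yet-seen functions whose parameters mention the frontier.
        nxt = {
            fname for fname, f in FUNCS.items()
            if set(inspect.signature(f).parameters) & frontier
        } - hit
        hit |= nxt
        frontier = nxt
    return hit
```

So a change to `raw_data` is flagged as impacting both `mean` and `report`, which is exactly the question you want answered before editing.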

> managing the data over the lifetime of a model’s use,

Hamilton isn't opinionated about where data is stored. But if you define the flow of computation with Hamilton and version it with a version control system like git, then the only additional thing you need to track is the configuration your Hamilton code was run with. Associate those two with the produced materialized data/artifact (i.e. git SHA + config + materialized artifact) and you have a good base for asking and answering queries about what data was used, when, and where. Rather than bringing in third-party systems here, we think there's a lot you can leverage from Hamilton itself.
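The "git SHA + config + artifact" triple can be as simple as a small run record written alongside each artifact. A minimal sketch, with `write_run_record` and the record layout invented for illustration:

```python
import hashlib
import json

def artifact_fingerprint(data: bytes) -> str:
    # Content hash so the record can be matched to the artifact later.
    return hashlib.sha256(data).hexdigest()

def write_run_record(git_sha: str, config: dict, artifact: bytes) -> str:
    """Serialize the (code version, config, artifact) association."""
    record = {
        "git_sha": git_sha,            # version of the Hamilton code
        "config": config,              # what the dataflow was run with
        "artifact_sha256": artifact_fingerprint(artifact),
    }
    return json.dumps(record, sort_keys=True)
```

Given a model artifact on disk, querying "what code and config produced this?" becomes a lookup over these records.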

For example, we have users looking at Hamilton to help answer governance concerns about the models they produce.

> and serving adjacent needs (e.g. post-deployment analytics).

If it's offline, then you can model and run that with Hamilton. The idea is to provide integrations with whatever MLOps system you use, so that it's easy to swap out.

For online, e.g. a web service, you could model the dataflow with Hamilton and then build your own custom "compilation" step that projects the Hamilton dataflow onto a serving topology. During that projection you could insert whatever monitoring concerns you want. So, to be clear, this part isn't straightforward right now, but there is a path to addressing it.
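To make the projection idea concrete (this is a hedged sketch, not an existing Hamilton feature; `instrument` and `TIMINGS` are invented names): wrap each node as you place it into the topology, so monitoring is inserted without touching the node's logic.

```python
import functools
import time

# Per-node latency, populated as wrapped nodes run.
TIMINGS: dict[str, float] = {}

def instrument(fn):
    """Wrap a dataflow node to record its wall-clock latency."""
    @functools.wraps(fn)
    def wrapped(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            TIMINGS[fn.__name__] = time.perf_counter() - start
    return wrapped

# At projection time you'd apply `instrument` to every node; shown here
# on a single toy node.
@instrument
def featurize(x: float) -> float:
    return x * 2.0
```

The same hook point could emit metrics to whatever monitoring backend the web service already uses.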


