Jun 17, 2023 · edited Jun 17, 2023 · Liked by timo dechau
Amplitude can also pull data from data warehouses like Snowflake. Is this integration not good enough? What is Kubit's USP?
Since Amplitude is also marketing itself as a CDP, my guess is that the Amplitude-Snowflake integration is going to get tighter over time. So it will be hard for newcomers like Kubit to challenge something like Amplitude. This integration will also help Amplitude counter the narrative of warehouse-first CDPs such as RudderStack.
I think it comes down to the details and what you need.
The major difference is that Amplitude loads your data into Amplitude, while Kubit runs a query in your warehouse and only receives the results.
This plays out in a few ways:
- Privacy-wise, keeping everything in your DWH is preferable.
- Performance-wise, on big data Amplitude's approach might be quicker, but that needs benchmarks.
- Running on top of the DWH gives you more flexibility for extending the data model (Amplitude needs a full sync).
- You can use different leading identifiers in Kubit, like user, account, product, or campaign, next to each other.
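The query-federation difference above can be sketched in a few lines. This is a minimal illustration, not either vendor's actual implementation, using Python's sqlite3 as a stand-in for Snowflake and made-up table and column names:

```python
import sqlite3

# Stand-in for the warehouse: sqlite3 instead of Snowflake.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, event TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("u1", "signup", "2023-06-01"),
        ("u1", "purchase", "2023-06-02"),
        ("u2", "signup", "2023-06-01"),
        ("u3", "purchase", "2023-06-03"),
    ],
)

# Warehouse-native style (Kubit): the aggregation runs inside the
# warehouse, and only the small result set leaves it.
result = conn.execute(
    "SELECT event, COUNT(DISTINCT user_id) AS users "
    "FROM events GROUP BY event ORDER BY event"
).fetchall()
print(result)  # [('purchase', 2), ('signup', 2)]

# Load-based style (Amplitude): raw events are synced out first,
# then aggregated inside the tool's own store.
raw = conn.execute("SELECT * FROM events").fetchall()
print(len(raw))  # 4 raw rows transferred instead of 2 result rows
```

The privacy and flexibility points follow directly: in the first style the raw rows never leave your warehouse, and extending the data model is just another query rather than a full re-sync.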
Thanks for the detailed reply. I get the difference now. However, as with anything, there is a downside to the direct-query-in-DWH approach.
I think tools like Amplitude that pull data in create silos, and those silos are essential. They are an opinionated way of doing analytics, and they obviously don't fulfil all user needs. I always see the role of the DWH as filling the gaps those silos leave. All these DWH-first approaches have the downside that the data engineering team now has to support use cases that were otherwise out of the box in tools like Amplitude. That adds enormous complexity to the resources and effort required from the data engineering team. The ROI is not at all justifiable in these scenarios! There are exceptions, but 99% of companies in the world will struggle with a pure DWH-first approach. What do you think?
Good point. But I don't see the product-analytics silo as essential.
There is one paradigm problem with the DWH: it is usually built for BI use cases, which means events are second-class citizens.
But for event data or explorative analytics, you need events as the core element. Most commonly used data modeling approaches don't support that. And just giving access to the Segment or RudderStack event tables misses the magic I am looking for.
Using something like Activity Schema helps to change that. But it needs adoption.
But it can be built on top of the existing data models, and then the additional costs are manageable.
But as with anything, it really depends on the case.
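The Activity Schema idea mentioned above can be sketched roughly as follows: union the entity-shaped BI tables into one stream where the activity itself is the row. This is a simplified illustration of the pattern, not the full spec, with made-up table names and sqlite3 standing in for the warehouse:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Typical BI-style tables: shaped around entities, not events.
conn.execute("CREATE TABLE orders (customer TEXT, ordered_at TEXT, total REAL)")
conn.execute("CREATE TABLE tickets (customer TEXT, opened_at TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [("a@x.com", "2023-06-01", 50.0), ("b@x.com", "2023-06-02", 20.0)])
conn.executemany("INSERT INTO tickets VALUES (?, ?)",
                 [("a@x.com", "2023-06-03")])

# Activity-Schema-style: union everything into a single activity
# stream, making the event the first-class citizen.
conn.execute("""
    CREATE TABLE activity_stream AS
    SELECT customer,
           ordered_at AS ts,
           'completed_order' AS activity,
           total AS revenue_impact
    FROM orders
    UNION ALL
    SELECT customer,
           opened_at AS ts,
           'opened_ticket' AS activity,
           NULL AS revenue_impact
    FROM tickets
""")
rows = conn.execute(
    "SELECT activity, COUNT(*) FROM activity_stream GROUP BY activity ORDER BY activity"
).fetchall()
print(rows)  # [('completed_order', 2), ('opened_ticket', 1)]
```

Because the stream is built as a view over the existing models, the additional cost is one transformation layer rather than a second tracking pipeline, which is the point about manageable costs above.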
For all kinds of product and marketing analytics that often require adding data from Google Ads, Facebook Ads, Salesforce, or Mailchimp, I think Amplitude offers a nice out-of-the-box solution. If we try to build the same thing with a DWH plus a BI tool, I think the added data engineering and data analyst costs will be difficult to justify, won't they?
When out of the box works for your setup, it is usually the better option cost-wise. With a high event volume, that calculation might already change.
If you use smaller or less common marketing platforms, you depend on an integration being available.
So, in the end, it is up to you to decide when a stack works and when the limitations become real blockers.
There are no best solutions in general, only in context.
I don't see how Kubit does away with something like RudderStack or Segment; isn't there still a need for a unified tracking layer of some kind?
You need event data. So you still need a way to get events into your DWH. But if you, for example, just collect events from your application databases, there is no need for Segment or Rudderstack.
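A minimal sketch of what "collecting events from your application databases" could look like: poll the operational tables with a watermark and turn changed rows into events, with no Segment or RudderStack in between. The table, column, and event names here are hypothetical, and sqlite3 stands in for the application database:

```python
import sqlite3

# Stand-in application database with an operational table.
app_db = sqlite3.connect(":memory:")
app_db.execute("CREATE TABLE orders (id INTEGER, status TEXT, updated_at TEXT)")
app_db.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, "paid", "2023-06-01T10:00"),
                    (2, "paid", "2023-06-02T09:00"),
                    (3, "shipped", "2023-06-03T12:00")])

def extract_events(conn, watermark):
    """Incremental pull: rows touched since the last run become events."""
    rows = conn.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    return [{"event": f"order_{status}", "order_id": oid, "ts": ts}
            for oid, status, ts in rows]

# Everything after the last successful run becomes new event rows
# to load into the DWH.
events = extract_events(app_db, "2023-06-01T23:59")
print([e["event"] for e in events])  # ['order_paid', 'order_shipped']
```

In practice you would use change data capture rather than polling, but the point stands: if the application database already records the state changes you care about, the tracking SDK layer becomes optional.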
Sure, but that's always been the case. :) It's more a question of rolling your own or using an SDK. I do feel like a tracking layer helps with controlling different SDKs (attribution SDKs, etc.) and keeps things tidy, so I was unsure if Kubit did away with that selling point.
It really depends on how you use it, and especially where you need to send the data and how quickly. When it comes to a controlled environment, I would go with Snowplow.
This is very timely Timo, thanks for putting this down!
Thanks! I need to keep up with your CDP work.
Oh good luck with that. I'm trying to dial it down but it's not happening. Everyone wants to talk to me about CDP stuff so expect a whole lot more.
I would still say CDP is hotter in the market than product analytics.
Oh yeah for sure. It's too hot to handle!
Great read!
Thanks! Especially because you are an essential part of all these changes.
Wonderfully put, Timo! Look forward to your take on how BI compares in your next post.