[WIP] Create Cassandra db schema on session initialization #5922
plugin/storage/cassandra/factory.go
Outdated
return result
}

func constructQueriesFromTemplateFiles(session cassandra.Session, params *StorageConfigParams) ([]cassandra.Query, error) {
Is cassandra.Session not able to execute multiple queries at once?
Are you talking about running individual queries in parallel or a batch query option? I didn't try executing parallel queries.
…ution for initialize database Signed-off-by: Alok Kumar Singh <[email protected]>
//Datacenter is the name for network topology
Datacenter string `mapstructure:"datacenter" valid:"optional"`
// TraceTTL is Time To Live (TTL) for the trace data in seconds
TraceTTL int `mapstructure:"trace_ttl" valid:"optional"`
Can the type here be time.Duration so that the user could specify 72h?
I am assuming there should be validation to make sure that the user can't specify something like "ms"?
That's the point of using a strong type: the business logic does not need to worry about validation, which should happen separately during parsing. I believe if you simply change int to time.Duration and use 10ms as a value in YAML, the parser will do the right thing.
TraceTTL and DependecyTTL set default_time_to_live (https://cassandra.apache.org/doc/latest/cassandra/developing/cql/ddl.html), which should be supplied in seconds. Won't making it time.Duration and allowing users to specify something like 10ms complicate it? We would have to block durations more precise than seconds, such as milliseconds or nanoseconds.
The options are: either we name the field trace_ttl_seconds and force the user to deal with ridiculous numbers like 1209600 (2w - can you tell? I can't without doing math), or we keep a clean name trace_ttl and let the user specify easy-to-read values like 24h or 14d.
// CasVersion is version of cassandra used
CasVersion int `mapstructure:"cas_version" valid:"optional"`
// CompactionWindow of format "^[0-9]+[mhd]$" tells the compaction window of the db
CompactionWindow string `mapstructure:"compaction_window" valid:"optional"`
So this is defining a time interval? Can we then also use the time.Duration type?
Only the time units from https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/TimeUnit.html are allowed, as per https://cassandra.apache.org/doc/latest/cassandra/managing/operating/compaction/twcs.html.
Also, the format below is based on the script currently in use. Should we go ahead with breaking it?
// CompactionWindow of format "^[0-9]+[mhd]$" tells the compaction window of the db
We are creating a new config, not bound by previous restrictions. If the destination API does not allow smaller units, we can always add the validation cw >= time.Minute.
@@ -0,0 +1,43 @@
-- There are total 4 queries here
What's the point of the test schema?
Instead of testing all the queries and manually creating the expected result for each of them, I reduced the number of queries to make the expected result easier to create.
That doesn't really make sense to me. We can either test that the template is correct or that the code using the template is correct. You're not doing the former by having another template, and the latter you can do with the primary template. We don't need to validate that the output of running the template is "as expected"; that's like testing the Go template package.
> We don't need to validate that the output of running template is "as expected", that's like testing the Go template package.
We are also removing comments and constructing individual query strings by iterating over the lines of the template output. The test checks this individual query string construction, which involves more than just the template package.
> and the latter you can do with the primary template.
Sure, I think I can write an integration test for this.
Hi @yurishkuro, I have the integration test script ready. Should I merge it with the current cassandra-integration-test.sh? This would require adding a fourth parameter, skip_apply_schema, for running the script.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5922      +/-   ##
==========================================
- Coverage   96.41%   96.18%   -0.24%
==========================================
  Files         353      355       +2
  Lines       20135    20310     +175
==========================================
+ Hits        19414    19535     +121
- Misses        535      572      +37
- Partials      186      203      +17
// NewSession creates a new Cassandra session
func (c *Configuration) NewSession() (cassandra.Session, error) {
err := c.newSessionPrerequisites()
This is redundant. Create the session as before and call GenerateSchemaIfNotPresent just before returning from NewSession.
@@ -0,0 +1,43 @@
-- There are total 4 queries here
I already asked - why do we need this file?
@@ -58,6 +58,18 @@ type Schema struct {
// while connecting to the Cassandra Cluster. This is useful for connecting to clusters, like Azure Cosmos DB,
// that do not support SnappyCompression.
DisableCompression bool `mapstructure:"disable_compression"`
// Datacenter is the name for network topology
We need to add CreateSchema bool (default false) for backwards compatibility.
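A rough sketch of how such an opt-in flag might gate schema creation inside NewSession follows. Everything except the CreateSchema field name and the NewSession method name is hypothetical: the Session interface, the Connect hook, the createSchema helper and the two example queries are stand-ins so the sketch runs without a Cassandra cluster, and they are not the types used in this PR.

```go
package main

import "fmt"

// Session is a stand-in for the real Cassandra session type (assumption).
type Session interface {
	Exec(query string) error
}

// Schema holds schema-related settings. CreateSchema defaults to false,
// so existing deployments keep their current behavior.
type Schema struct {
	CreateSchema bool
	Keyspace     string
}

// Configuration is a simplified stand-in; Connect is injected for the demo.
type Configuration struct {
	Schema  Schema
	Connect func() (Session, error)
}

// createSchema runs idempotent CREATE statements (illustrative queries only).
func createSchema(s Session, cfg Schema) error {
	queries := []string{
		fmt.Sprintf("CREATE TYPE IF NOT EXISTS %s.keyvalue (key text, value_type text)", cfg.Keyspace),
		fmt.Sprintf("CREATE TABLE IF NOT EXISTS %s.traces (trace_id blob PRIMARY KEY)", cfg.Keyspace),
	}
	for _, q := range queries {
		if err := s.Exec(q); err != nil {
			return fmt.Errorf("executing %q: %w", q, err)
		}
	}
	return nil
}

// NewSession creates the session as before and, only when opted in,
// creates the schema just before returning.
func (c *Configuration) NewSession() (Session, error) {
	session, err := c.Connect()
	if err != nil {
		return nil, err
	}
	if c.Schema.CreateSchema {
		if err := createSchema(session, c.Schema); err != nil {
			return nil, fmt.Errorf("failed to create schema: %w", err)
		}
	}
	return session, nil
}

// recordingSession records queries instead of talking to a database.
type recordingSession struct{ executed []string }

func (r *recordingSession) Exec(q string) error {
	r.executed = append(r.executed, q)
	return nil
}

func main() {
	for _, enabled := range []bool{false, true} {
		rec := &recordingSession{}
		cfg := Configuration{
			Schema:  Schema{CreateSchema: enabled, Keyspace: "jaeger_v1_dc1"},
			Connect: func() (Session, error) { return rec, nil },
		}
		if _, err := cfg.NewSession(); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("CreateSchema=%v executed %d queries\n", enabled, len(rec.executed))
	}
}
```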
CompactionWindowUnit string `mapstructure:"compaction_window_unit" valid:"optional"`
}

func DefaultParams() TemplateParams {
Not needed. There is a DefaultConfig function somewhere; it should populate these settings.
expOutputQueries := []string{
`CREATE TYPE IF NOT EXISTS jaeger_v1_dc1.keyvalue (
key text,
All these details are not relevant for the unit test. After all, we don't know if they are correct anyway, since we're not testing against a real DB (the integration test will do that). So at most you can verify that CREATE commands are issued for the expected db objects, but there's no need to match full queries.
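One way to follow this advice is to extract only the kind and name of each created object and compare those, ignoring column definitions. The sketch below assumes the queries are available as plain strings; createdObjects and the regular expression are hypothetical helpers written for illustration, and the second query is an invented example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// createRe captures the object kind and its qualified name from a CREATE
// statement, with or without "IF NOT EXISTS".
var createRe = regexp.MustCompile(`(?i)^\s*CREATE\s+(TYPE|TABLE|INDEX|KEYSPACE)\s+(?:IF\s+NOT\s+EXISTS\s+)?([\w.]+)`)

// createdObjects returns entries such as "TYPE jaeger_v1_dc1.keyvalue" for
// every CREATE statement in queries, ignoring the statement bodies.
func createdObjects(queries []string) []string {
	var out []string
	for _, q := range queries {
		if m := createRe.FindStringSubmatch(q); m != nil {
			out = append(out, strings.ToUpper(m[1])+" "+m[2])
		}
	}
	return out
}

func main() {
	queries := []string{
		"CREATE TYPE IF NOT EXISTS jaeger_v1_dc1.keyvalue (\n    key text,\n    value_type text\n);",
		"CREATE TABLE IF NOT EXISTS jaeger_v1_dc1.traces (\n    trace_id blob,\n    PRIMARY KEY (trace_id)\n);",
	}
	for _, obj := range createdObjects(queries) {
		fmt.Println(obj)
	}
}
```

A test built on this compares a short list of object names, so it does not break when a column is added to the template.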
s)
SKIP_APPLY_SCHEMA="true"
;;
*)
;;
Please use two-space indentation, not tabs.
Create Schema (if not present) on Session Initialization
Once a session is established with the Cassandra database, the added code parses the template file containing the schema-creation queries and builds individual queries from it. It then executes those queries to create the required types and tables.
Which problem is this PR solving?
Resolves #5797
Description of the changes
How was this change tested?
Checklist
jaeger: make lint test
jaeger-ui: yarn lint and yarn test