I am collecting data coming from multiple data sources, the data lands in my data lake.
I need to organize all the where clause of all the queries, for example one dataset is about employee data, when i create models for this dataset my query would look like
– Employees hired in 1986
SELECT hire_date, first_name, last_name, emp_no
WHERE Year(hire_date) = ‘1986’
– All employees in Sales (employee number, last name, first name, and department name)
SELECT e.emp_no, e.last_name, e.first_name, x.dept_name
FROM employees e
LEFT JOIN dept_emp d ON e.emp_no = d.emp_no
LEFT JOIN departments x ON d.dept_no = x.dept_no
WHERE x.dept_name = ‘Sales’
Each dataset have its own set of where clauses, what is the best practice to organize all those where clauses, in order to store them in an organized way and embed those rules dynamically in the model. I got lots of those where clauses , some are unique for the dataset and few are common between datasets.