Accelerating Splunk Dashboards with Base Searches and Saved Searches

Let's say we have a Splunk dashboard with multiple panels. Each panel has its own search request, and all of these requests run independently and simultaneously. If they are complex enough, rendering the dashboard can take quite a long time, and some panels may even fail with a timeout.
How can we avoid this? The first step is to understand how the searches are related. It may be possible to select some base searches and reuse their results in child searches. It is also possible to get cached results from Saved Searches (which is just another name for Reports in the Splunk GUI).
Base Searches
We can set an id for a search in a Splunk dashboard panel:
<search id="long_running_search"><query>…</query></search>
In other panels, we can use the results of a base search like this:
<search base="long_running_search"><query>…</query></search>
A child search with a base parameter waits until the related base search is completed and then executes its own request, using the base search results as input. Note that the request in the child search should NOT begin with "|".
A child search can itself be a base for another search, so a search in a panel can have both id and base attributes at the same time:
<search id="my_search" base="long_running_search"><query>…</query></search>
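To see how these pieces fit together, here is a minimal sketch of a complete dashboard that combines a base search with a child search, uploaded via the Splunk REST API (the same approach as in "How to create and manage Splunk dashboards via API"). The server address, credentials, the application name "my_application" and the index "important_index" are placeholder assumptions:

import requests

splunk_server = "https://splunk.corporation.com:8089"
user = "username"
password = "password"

# A dashboard with one base search and one child search built on its results.
# The child query has no leading "|" and is applied to the base search output.
dashboard_xml = """<dashboard>
  <label>Base search example</label>
  <search id="long_running_search">
    <query>index="important_index" | stats count by host, product</query>
    <earliest>-24h</earliest>
    <latest>now</latest>
  </search>
  <row>
    <panel>
      <table>
        <search base="long_running_search">
          <query>stats count by product | sort - count</query>
        </search>
      </table>
    </panel>
  </row>
</dashboard>"""

# Dashboards live under data/ui/views; "nobody" and "my_application" are placeholders
data = {'name': 'base_search_example', 'eai:data': dashboard_xml}
response = requests.post(splunk_server + '/servicesNS/nobody/my_application/data/ui/views',
                         data=data, auth=(user, password), verify=False)
print(response.status_code)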
Saved Searches
Base searches help eliminate redundant requests, but they don't solve the main issue: what if the base search itself takes a long time to execute? It is especially wasteful to run it every time the dashboard is rendered if the underlying data changes infrequently, say once an hour or once a day. One example is software installation data, which I mentioned in "How to create and manage Splunk dashboards via API" and "Asset Inventory for Internal Network: problems with Active Scanning and advantages of Splunk".
It would be much better if we could save the results of the searches and update them periodically, right? And it's possible. The idea is that we can create a report (or saved search, which is the same thing), schedule it, and use its results in other searches. We can get the results by search name using the loadjob command:
| loadjob savedsearch="splunk_api_user:test_application:my_saved_search"
Or by its sid:
| loadjob “scheduler_YS5sZW9ub3Yx_dGVz449hcHBsaWNhdGlvbg__RMD5a592c12e6493315d_at_1534165600_7312”
Keep in mind that loadjob always returns the results of the last scheduled run, so it ignores the dashboard's time picker; this is fine for regular dashboards, but not always suitable for forms where the user chooses the time range. And to update the dashboards automatically (in my opinion the best way, since manual editing is too tedious and routine), we will have to work with saved searches via the Splunk API.
Splunk Saved Searches API
To work with a saved search, we need to know the application where the search will be created and its author. This is the same as for dashboards, which I described in "How to create and manage Splunk dashboards via API".
We have a Splunk server and user account for automation:
import json
import requests
splunk_server = "https://splunk.corporation.com:8089"
user = "username"
password = "password"
Getting the application name and its author:
application_label = u"My Dashboard"
print(application_label)
app_name = ""
app_author = ""
data = {'output_mode': 'json'}
# query parameters go into params for a GET request
response = requests.get(splunk_server + '/services/apps/local?count=-1', params=data,
                        auth=(user, password), verify=False)
for entry in json.loads(response.text)['entry']:
    if entry['content']['label'] == application_label:
        app_name = entry['name']
        app_author = entry['author']
print(app_name)
print(app_author)
Now we can create a new saved search. It is easier to set time limits directly in the text of the search request. For example, if we want data for the last 5 days, we add "earliest=-5d" at the beginning:
search_name = 'my_saved_search'
search_request = 'earliest=-5d index="important_index"'
data = {'output_mode': 'json',
        'name': search_name,
        'search': search_request,
        }
response = requests.post(splunk_server + '/servicesNS/' + app_author + '/' + app_name + '/saved/searches', data=data,
                         auth=(user, password), verify=False)
print(response.text)
I will not show the output of the command here. It is very long and contains many parameters of the saved search that are not actually used. We will see the same parameters using a GET request with the same saved search name:
data = {'output_mode': 'json'}
response = requests.get(splunk_server + '/servicesNS/' + app_author + '/' + app_name + '/saved/searches/' + search_name, params=data,
                        auth=(user, password), verify=False)
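If we only need a few of these parameters, we can pick them out of the JSON response; a small sketch (cron_schedule, is_scheduled and next_scheduled_time are standard saved search attributes):

# print selected parameters of the saved search from the GET response
content = json.loads(response.text)['entry'][0]['content']
for key in ['search', 'cron_schedule', 'is_scheduled', 'next_scheduled_time']:
    print(key + ' = ' + str(content.get(key)))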
Or in the Splunk GUI:
| rest /servicesNS/-/-/saved/searches | where title = "my_saved_search"
We can edit the search and make it run regularly, with the launch time set in cron format:
cron = "00 12 * * *"
data = {'output_mode': 'json',
        'cron_schedule': cron,
        'is_scheduled': 'true'
        }
response = requests.post(splunk_server + '/servicesNS/' + app_author + '/' + app_name + '/saved/searches/' + search_name, data=data,
                         auth=(user, password), verify=False)
print(response.text)
A new saved search will not contain any data yet. In theory, we can fill it using a dispatch request:
data = {'output_mode': 'json',
        'trigger_actions': '1'
        }
response = requests.post(splunk_server + '/servicesNS/' + app_author + '/' + app_name + '/saved/searches/' + search_name + '/dispatch', data=data,
                         auth=(user, password), verify=False)
print(response.text)
But even after that, I could not get the data from this search by its name, only by sid, which is not very practical. So for debugging I simply scheduled the search to run at the current time plus 2 minutes, and then changed the schedule to the time I actually needed.
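Here is a sketch of that debugging trick in Python, reusing the variables defined above; the cron string format is minute, hour, day of month, month, day of week:

import datetime

# schedule the saved search to run two minutes from now, so the scheduler
# fills it with data; later, re-post the real cron schedule
run_at = datetime.datetime.now() + datetime.timedelta(minutes=2)
debug_cron = run_at.strftime("%M %H %d %m *")
data = {'output_mode': 'json',
        'cron_schedule': debug_cron,
        'is_scheduled': 'true'
        }
response = requests.post(splunk_server + '/servicesNS/' + app_author + '/' + app_name + '/saved/searches/' + search_name, data=data,
                         auth=(user, password), verify=False)
print(response.text)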
Finally, we can delete the saved search by its name:
data = {'output_mode': 'json'}
response = requests.delete(splunk_server + '/servicesNS/' + app_author + '/' + app_name + '/saved/searches/' + search_name, params=data,
                           auth=(user, password), verify=False)
print(response.text)
If everything is fine, it will return:
{"links":{"create":"/servicesNS/nobody/my_application/saved/searches/_new","_reload":"/servicesNS/nobody/my_application/saved/searches/_reload","_acl":"/servicesNS/nobody/my_application/saved/searches/_acl"},"origin":"https://splunk.corporation.com:8089/servicesNS/nobody/my_application/saved/searches","updated":"2018-10-12T13:14:37+03:00","generator":{"build":"03bbabbd5c0f","version":"7.0.2"},"entry":[],"paging":{"total":0,"perPage":30,"offset":0},"messages":[]}
Knowing these basic operations for dashboards and searches, we can automatically create and update them from our own Python scripts.
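As a closing illustration, here is a minimal sketch that puts the create and schedule steps together in one helper function; it reuses the placeholder variables defined above and nothing beyond the requests shown in this post:

def create_scheduled_search(name, request, cron):
    # saved searches of the application, owned by app_author
    base_url = (splunk_server + '/servicesNS/' + app_author + '/' + app_name +
                '/saved/searches')
    # create the saved search (the request fails if it already exists)
    requests.post(base_url, data={'output_mode': 'json', 'name': name, 'search': request},
                  auth=(user, password), verify=False)
    # put it on a cron schedule
    response = requests.post(base_url + '/' + name,
                             data={'output_mode': 'json', 'cron_schedule': cron,
                                   'is_scheduled': 'true'},
                             auth=(user, password), verify=False)
    return response.status_code

print(create_scheduled_search('my_saved_search', 'earliest=-5d index="important_index"',
                              '00 12 * * *'))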