
Import Web Content

Train your agent with websites from the internet!

Example request:

curl --request POST \
  --url https://chat.api.toolzz.com.br/api/v2/extractor/upload-web-scraper \
  --header 'Authorization: Bearer TOKEN_HERE' \
  --header 'Content-Type: application/json' \
  --data '
{
  "unityId": "<string>",
  "datasetId": "<string>",
  "folderId": "<string>",
  "url": "<string>",
  "limit": 123
}
'

Example response:

{
  "message": "<string>",
  "folder": {
    "id": "<string>",
    "name": "<string>",
    "isRoot": true,
    "knowLedgeBaseId": "<string>"
  },
  "urls": [
    {}
  ],
  "files": [
    {
      "id": "<string>",
      "status": "<string>",
      "fileName": "<string>",
      "maskName": "<string>",
      "url": "<string>",
      "size": 123,
      "extension": "<string>",
      "createdAt": "<string>",
      "kbFolder": {
        "name": "<string>"
      }
    }
  ]
}
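
Request headers

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| Authorization | String | Access token, preceded by the word “Bearer” and a space (e.g., Bearer TOKEN_HERE). | Yes |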

Request body

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| unityId | UUID | Unique identifier of the owning organizational unit. | Yes |
| datasetId | UUID | Target Knowledge Base (Dataset) identifier. | Yes |
| folderId | UUID | Identifier of the specific folder where the content will be saved. | Yes |
| url | URL | The link to the website or web page that will be imported and processed. | Yes |
| limit | Number | Optional depth or page limit for scraping. | No |
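
The request described above can also be assembled outside of curl. The following is a minimal Python sketch using only the standard library; the helper name `build_import_request` and the placeholder identifiers are illustrative assumptions, not part of the API. It builds the request object without sending it, and leaves out `limit` when the caller does not set it.

```python
import json
import urllib.request

# Endpoint URL taken from the curl example in this page.
ENDPOINT = "https://chat.api.toolzz.com.br/api/v2/extractor/upload-web-scraper"


def build_import_request(token, unity_id, dataset_id, folder_id, url, limit=None):
    """Build (but do not send) the POST request for importing web content.

    Hypothetical helper for illustration; only the field names come from the docs.
    """
    payload = {
        "unityId": unity_id,
        "datasetId": dataset_id,
        "folderId": folder_id,
        "url": url,
    }
    # "limit" is documented as optional, so include it only when it is set.
    if limit is not None:
        payload["limit"] = limit

    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only; replace them with real identifiers.
request = build_import_request(
    token="TOKEN_HERE",
    unity_id="<unity-uuid>",
    dataset_id="<dataset-uuid>",
    folder_id="<folder-uuid>",
    url="https://example.com",
    limit=10,
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Keeping construction separate from sending makes it possible to inspect the headers and body before any network call is made.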

Response fields

| Key | Type | Description |
| --- | --- | --- |
| message | String | Confirmation message (e.g., “Web content imported successfully”). |
| folder.id | UUID | Unique identifier of the folder where the contents were saved. |
| folder.name | String | Name of the folder that received the import. |
| folder.isRoot | Boolean | Indicates if the import was made in the root folder of the base. |
| folder.knowLedgeBaseId | UUID | ID of the Knowledge Base that owns the content. |
| files[].id | UUID | Unique ID of the record generated from the web content. |
| files[].status | String | Processing state (e.g., SUCCESS, ERROR, PROCESSING). |
| files[].fileName | String | Technical name of the .txt or .html file generated to store the page content. |
| files[].maskName | String | Web page title or friendly display name. |
| files[].url | URL | Link to the generated file containing the extracted text from the site. |
| files[].size | Number | Extracted content size in bytes. |
| files[].extension | String | Generated text file extension (usually txt). |
| files[].createdAt | String | Import date and time (ISO 8601). |
| files[].kbFolder.name | String | Name of the folder associated with the import record. |
| urls[] | Array | List of references to the original links sent in the request. |
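
As an illustration of how these response fields might be consumed, the sketch below groups the returned files by their `status` value. The function name `summarize_import`, the `UNKNOWN` fallback label, and the sample payload are assumptions made for the example; the sample is hand-written to follow the documented shape and is not real API output.

```python
import json
from collections import defaultdict


def summarize_import(response_text):
    """Return the folder name and the file names grouped by processing status."""
    data = json.loads(response_text)
    by_status = defaultdict(list)
    for record in data.get("files", []):
        # "UNKNOWN" is a local fallback label, not a status documented by the API.
        by_status[record.get("status", "UNKNOWN")].append(record.get("fileName"))
    return data.get("folder", {}).get("name"), dict(by_status)


# Hand-written sample following the documented response shape.
sample_response = json.dumps({
    "message": "Web content imported successfully",
    "folder": {"id": "f-1", "name": "Docs", "isRoot": False, "knowLedgeBaseId": "kb-1"},
    "urls": [{}],
    "files": [
        {"id": "1", "status": "SUCCESS", "fileName": "home.txt"},
        {"id": "2", "status": "PROCESSING", "fileName": "about.txt"},
        {"id": "3", "status": "SUCCESS", "fileName": "pricing.txt"},
    ],
})

folder_name, files_by_status = summarize_import(sample_response)
```

Grouping by status makes it easy to spot records that are still in PROCESSING or that ended in ERROR after an import.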

This endpoint requires a valid access token, sent in the Authorization header of the request. The API is also protected by additional security measures to safeguard user data.
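
One way to keep the token out of source code is to read it from an environment variable when building the headers. The sketch below assumes a variable named `TOOLZZ_ACCESS_TOKEN`, which is a hypothetical name chosen for this example and not something the platform defines.

```python
import os


def auth_headers(env_var="TOOLZZ_ACCESS_TOKEN"):
    """Build request headers from a token stored in an environment variable.

    The variable name is an assumption for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your access token first.")
    # Tolerate a token that was pasted with the "Bearer " prefix already attached.
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Calling `auth_headers()` after exporting the variable returns a dictionary that can be passed as the headers of the request.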

To obtain your access token, follow these steps:

  1. Log in to the ToolzzAI platform
  2. Click on “Settings”
  3. Click on “Access Token”
  4. Copy the access token

[Image: Access token page]