Tech Writers

This is our blog for technology lovers! Here, Softplayers and other specialists share knowledge that is fundamental to the growth of this community.

RabbitMQ Simulator: A useful tool for planning messaging systems
Tech Writers October


The need to create more resilient and decoupled software has driven the adoption of messaging, a communication pattern in which different applications or parts of a system "talk" to each other through messages, asynchronously and without losing information along the way¹. One of the most widely used protocols for this is AMQP 0-9-1 (Advanced Message Queuing Protocol), which is based on four elements: producer, exchange, queue, and consumer². Each plays an essential role in the message flow:

- Producer: the application, or part of a system, that originates the communication and sends the messages. It is responsible for publishing information to the exchange without worrying about who will receive it.
- Exchange: acts as a message router, receiving messages from the producer and directing them to one or more queues, depending on its type (Fanout, Topic, Direct, or Headers).
- Queue: the temporary repository where messages are stored until they are consumed. There may be one or more queues, depending on the system architecture.
- Consumer: the application that receives and processes messages from the queue, completing the communication cycle.

Together, these components establish asynchronous communication between different parts of a system without any of them needing to know the implementation details of the others. Many tools have emerged to implement this architecture, and one of the most popular is RabbitMQ. Visually simulating how this communication works can help you plan a solution that meets customer needs. For this purpose, there is a website called "RabbitMQ Simulator" which, as the name suggests, simulates how messaging works between different applications. In this article, you will see a step-by-step guide on how to simulate this communication and use it in your projects and/or studies.
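The four-element flow described above can be sketched with a minimal in-memory model. This is a hypothetical illustration only — the classes below are invented for this sketch and are not the RabbitMQ client API:

```python
from collections import deque

# Hypothetical in-memory model of the AMQP elements described above;
# the class names are illustrative, not the real RabbitMQ client API.
class Queue:
    def __init__(self, name):
        self.name = name
        self.messages = deque()  # temporary repository for unconsumed messages

class FanoutExchange:
    """Routes each published message to every bound queue."""
    def __init__(self):
        self.bound_queues = []

    def bind(self, queue):
        self.bound_queues.append(queue)

    def publish(self, message):
        # The producer publishes here without knowing who will receive it.
        for queue in self.bound_queues:
            queue.messages.append(message)

def consume(queue):
    # The consumer drains its queue, completing the communication cycle.
    return queue.messages.popleft() if queue.messages else None

exchange = FanoutExchange()
q1, q2 = Queue("queue-1"), Queue("queue-2")
exchange.bind(q1)
exchange.bind(q2)

exchange.publish("order created")   # producer side
print(consume(q1))                  # → order created
print(consume(q2))                  # → order created
```

Note how the producer only ever talks to the exchange, and each consumer only to its own queue — exactly the decoupling the protocol is designed to provide.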
Steps to use the RabbitMQ Simulator:

1. Go to the page: https://tryrabbitmq.com/.

2. Create the producer. The page shows a white canvas where the simulation takes place, with the elements of the architecture shown as icons on the left side. Drag the "producer" icon to the center and observe the circle that represents it. More than one producer can be created; to edit the name, just change the text where the red arrow points in the image. Let's create two producers: "Producer 1" and "Producer 2".

Image 1 – Adding two producers to the simulation

3. Create the consumer. Drag the "consumer" icon to the center and the yellow icon will appear. More than one consumer can be created; to edit the name, just change the text where the red arrow points in the image. Let's create two consumers: "Consumer 1" and "Consumer 2".

Image 2 – Adding two consumers to the simulation

4. Create the queue. The queue is an important element in this architecture: it is where messages are held until they are delivered to consumers. To create one, just click on the blue icon. You can edit the queue name, which is indicated by the red arrow in the following image. Notice the "msg: 0" counter above the queue icon? It represents the number of messages held in each queue, that is, those not yet processed by consumers. If messages aren't consumed, this number can grow indefinitely, like a full inbox. This impacts system performance, as queues consume memory and, in extreme cases, can even exhaust server resources. Therefore, in real systems it is common to configure size limits for queues, or policies to discard old messages.

Image 3 – Adding the queue to the simulation

5. Create the exchange. As we have seen, the exchange is the element that stands between the producer and the queue; it defines how messages are distributed among the queues and supports different types, each with specific routing rules. Next, we'll explore the Fanout, Topic, and Direct types.
5.1) Fanout

The fanout exchange sends each received message to all bound queues. It can be applied in systems where all interested parties must receive every message. To create an exchange, just drag the icon that represents it to the center of the canvas. You can choose its name and type; in this first example, it will be of the Fanout type.

Image 4 – Creating a fanout exchange

It is necessary to connect consumer, queue, exchange, and producer. To do this, hold down the Alt key and drag the mouse from the consumer to the queue, then from the queue to the exchange, and from the producer to the exchange. Arrows will appear representing the connections between these elements, as we can see in the image below. One more queue was created to better illustrate our example, and each queue was associated with a consumer. To send an example message, just click on "Producer 1", go to "New Message", and define the message body, its routing key, and the interval in seconds at which it is sent. We can see that our fanout exchange distributes the message to all queues.

Image 5 – Firing a message

5.2) Topic

A topic exchange uses a routing system based on two keys: (i) the routing key, an attribute attached to the message by the producer, which functions as the message's "address" and is read by the exchange; and (ii) binding keys, configured on each queue when binding it to the exchange, which act as filters that direct messages to the respective queues. To simulate this type, we click on the exchange we created, change its name to "Exchange topic", and set its type to Topic, which requires configuring the binding keys. By clicking on the arrows labeled "binding key", we can change each one to the name we are interested in. Following our example, one queue will receive messages with the key "product.log", feeding our client's logistics system, and the other will use "product.fin" to send data to the client's finance team.
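The binding-key matching that topic exchanges perform can be sketched in code. This is a hypothetical illustration of the AMQP topic-matching rules — a binding key is a dot-separated pattern where "*" matches exactly one word and "#" matches zero or more words:

```python
# Hypothetical sketch of topic-exchange matching (not RabbitMQ's actual
# implementation): "*" matches exactly one dot-separated word, "#" matches
# zero or more words, per the AMQP topic-exchange rules.
def matches(binding_key: str, routing_key: str) -> bool:
    def match(bind, route):
        if not bind:
            return not route          # both exhausted → match
        if bind[0] == "#":
            # "#" may swallow zero or more words of the routing key.
            return any(match(bind[1:], route[i:]) for i in range(len(route) + 1))
        if not route:
            return False
        if bind[0] == "*" or bind[0] == route[0]:
            return match(bind[1:], route[1:])
        return False
    return match(binding_key.split("."), routing_key.split("."))

print(matches("product.log", "product.log"))   # → True  (exact match)
print(matches("product.*", "product.fin"))     # → True  ("*" matches one word)
print(matches("product.*", "product.fin.eu"))  # → False ("*" matches only one word)
print(matches("product.#", "product.fin.eu"))  # → True  ("#" matches many words)
```

The exact-match case in the first line is also how a direct exchange behaves, which is why direct routing can be seen as a special case of topic routing without wildcards.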
By clicking on Producer 1, we configure a new message with routing key "product.log". Note that the exchange sends it only to the corresponding queue.

Image 6 – Message going to the corresponding queue (product.log)

Now, if we configure Producer 2 to send messages with routing key "product.fin" and change the binding key of queue 1 to "product.*", we will see that the message goes to all queues whose binding keys start with "product.".

Image 7 – Message going to all queues that start with "product."

5.3) Direct

Finally, the direct exchange delivers a message only to the queue or queues bound with a specific routing key; here the match must be exact. It is adopted in systems where messages are directed to specific consumers. Following the previous example, if Producer 2 sends a message with routing key "product.log", only the queue with that binding key will receive it.

Image 8 – Creating a direct-type exchange

By allowing the design of messaging models in different configurations, the RabbitMQ Simulator becomes a useful tool to assist developers and software architects in designing systems based on this communication pattern. Its didactic way of presenting the elements of the AMQP 0-9-1 protocol can also help those who want to learn more about this technology. The examples in this article were meant only to introduce the simulator, but they serve as an invitation for other developers to explore this tool in their own projects.

Sources:
¹ Development of a message broker. Giovanni Henrique Silva Oliveira. University of Coimbra, 2021. https://transportersystems.com/docs/dissertacao-giovanni-oliveira.pdf
² AMQP 0-9-1 Model Explained. RabbitMQ Documentation. Broadcom, 2025. https://www.rabbitmq.com/tutorials/amqp-concepts

CDC - Change Data Capture
Tech Writers October


Introduction

Change Data Capture (CDC) is a technique that identifies and records all changes (inserts, updates, and deletes) made to a data source, such as a relational database, and transmits these changes to other systems, enabling real-time synchronization. This approach is fundamental to many modern data architectures that require up-to-date and consistent information across disparate systems.

Benefits

CDC brings a series of benefits that go beyond simple data movement. The main ones are listed below.

- Reduced load on the source system, by capturing only changes and avoiding massive data reads, resulting in lower infrastructure costs.
- Real-time data replication, eliminating the latency that typically exists in traditional ETL processes.
- Ability to track changes for auditing, backup, and reprocessing on other platforms.
- Continuous updates to data lakes and data warehouses, feeding analytical systems with current data.

Main techniques

CDC can be implemented in different ways, each with specific advantages and limitations. Knowing these alternatives helps you select the most appropriate solution for each organization's context. The most common techniques are presented below.

- Reading transaction logs: captures changes directly from transaction logs, a method used by tools such as Debezium.
- Timestamp columns: uses a column that records the last modification to each row. Ideal for systems that do not have detailed transaction logs.
- Triggers: uses database triggers to record changes to tables, although this may impact performance.
- Snapshot differencing: compares snapshots taken periodically. It is a heavier, less precise approach.

Binlog Techniques (Transaction Log)

The binlog, or transaction log, is a file that records all operations performed on the database. Databases like MySQL (binlog) and PostgreSQL (WAL) have transaction logs that can be monitored in real time to capture changes.
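To make the idea concrete, here is a hedged sketch in pure Python of how a log-based CDC consumer replays a transaction log to reconstruct changes. The log format here is invented for illustration — real binlog/WAL records are binary and far richer:

```python
# Hypothetical, simplified "transaction log": each record is
# (operation, primary key, row data). Real binlog/WAL formats are
# binary and contain much more metadata.
transaction_log = [
    ("insert", 1, {"name": "Ana", "email": "ana@example.com"}),
    ("insert", 2, {"name": "Bia", "email": "bia@example.com"}),
    ("update", 1, {"name": "Ana", "email": "ana.new@example.com"}),
    ("delete", 2, None),
]

def replay(log):
    """Applies each log record in order, mirroring the source's state."""
    replica = {}
    for op, key, row in log:
        if op in ("insert", "update"):
            replica[key] = row
        elif op == "delete":
            # Unlike timestamp-column polling, log-based CDC sees deletes.
            replica.pop(key, None)
    return replica

replica = replay(transaction_log)
print(replica)  # → {1: {'name': 'Ana', 'email': 'ana.new@example.com'}}
```

Because every operation — including the delete of row 2 — appears in the log, the replica converges to the source's exact state, which is precisely the accuracy advantage discussed next.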
This approach is efficient and accurate because it captures every change made to the data, but it requires access to the logs and read permissions. The downside is that not all databases have such detailed transaction logs, and configuration may require infrastructure adjustments.

Timestamp Column

Consists of creating a column that records the last change made to each row of a table. By querying only the rows changed since the last run, the system captures incremental changes. This technique is simple and applicable to any database, but it does not capture deletions and requires that all monitored tables include this column.

Triggers

Triggers are functions defined in the database that perform actions when a specific event occurs (such as an insert, update, or delete). For CDC, triggers can be configured to record these changes in a history table. While effective, this method is often avoided due to its performance impact and maintenance complexity.

Diff between Snapshots

Consists of taking periodic snapshots of the data and comparing them to identify changes. It is a laborious method with a greater impact on the system, as it requires intensive processing, and is used only when other CDC approaches are not available.

Tools

Below are some of the main Change Data Capture tools on the market.

- Debezium: open-source tool that uses transaction logs to capture changes in real time. It supports databases like MySQL, PostgreSQL, MongoDB, and others, and integrates with Kafka, which facilitates data transmission in distributed data pipelines.
- Oracle GoldenGate: Oracle's proprietary tool for real-time data replication and integration, compatible with multiple database systems.
- StreamSets: integration platform that includes CDC components for various databases and data sources.
- Qlik Replicate (Attunity): robust CDC solution used for replication and data synchronization between databases.
- Amazon DMS and Azure Data Factory: CDC solutions offered on their respective cloud platforms for replication and data synchronization between services.
- Airbyte: open-source data integration platform that supports CDC through various connectors, including transaction-log monitoring for databases such as MySQL and PostgreSQL. It is a flexible and scalable option, with a user-friendly interface that makes it easy to create data pipelines. Airbyte has a wide library of connectors and stands out for its active community and customization possibilities.
- Striim: real-time data integration platform that combines CDC with streaming analytics and event processing. It supports multiple data sources, such as relational databases, and can stream data to multiple platforms, including clouds like AWS, Azure, and Google Cloud. Its robust architecture enables large-scale data replication and real-time processing, making it popular for use cases that require instant insights from data.

While many tools exist, in this article we'll focus on Debezium, a popular open-source option built on top of Kafka Connect, to illustrate a practical implementation.

Kafka Connector

Kafka Connector is a core feature of Kafka Connect, an Apache Kafka tool designed to simplify and automate integration between Kafka and other systems, such as databases, data warehouses, cloud storage systems, and more. Kafka Connect serves as a distributed, scalable integration layer that uses specialized connectors to reliably transfer data between systems.

How Kafka Connect works

Kafka Connect uses connectors, which are specialized plugins for each type of system or data source, divided into two categories:

- Source connectors: capture data from an external source and publish it to Kafka topics.
- Sink connectors: consume data from Kafka topics and write it to a target system.
Connectors are configured using JSON configuration files or the REST API, where you define the data source or destination, authentication information, replication settings, and optional transformations to conform the data to the desired format.

Kafka Connect Architecture

The Kafka Connect architecture is composed of workers, which are instances running connectors in a distributed manner. This allows Kafka Connect to:

- Scale horizontally as the data load increases.
- Be fault-tolerant: if one worker fails, another can take over.
- Centralize connector configuration and monitoring.

Data is processed in blocks called tasks, which distribute the workload among workers and enable parallel processing.

"Not everything is roses" (limitations)

Limited options

Kafka Connect itself offers fewer than 20 different connector types. If a generic connector is not suitable, you must develop a custom one. Similarly, Kafka Connect provides only a basic set of transformations, and you may have to write your own custom transformations as well. This can increase the time and effort required for implementation.

Configuration complexity

Kafka Connect configurations quickly become complex, especially when dealing with multiple connectors, tasks, and transformations. This can make the system difficult to manage, maintain, and troubleshoot. It is also challenging to integrate Kafka Connect with other tools your organization may already be using, which creates a barrier to adoption and slows down implementation.

Limited error-handling support

Kafka's primary focus is streaming data, which results in limited built-in error-handling capabilities. The distributed nature of Kafka and the potential interdependencies between its components increase complexity, making it difficult to discover the root cause of an error.
You may have to implement custom error-handling mechanisms, which can be time-consuming and not always as reliable as the built-in capabilities provided by other data integration and ETL tools. Recovering gracefully from complex errors is also challenging in Kafka, as there may not be a clear path to resume data processing after an error occurs. This can lead to inefficiencies and require more manual intervention to restore normal operations.

Performance issues

Most applications require high-speed data pipelines for real-time use cases. However, depending on the connector type and data volume, Kafka Connect may introduce some latency into your system. This may not be suitable if your application cannot tolerate delays.

"But there is a solution" (best practices)

Even with these limitations, it would be incorrect to say that Kafka Connect is not a good choice; it has numerous benefits that far outweigh its drawbacks. Kafka Connect is a powerful tool for building scalable data pipelines, but successful implementation and adoption require following best practices. Here are some tips to avoid mistakes and succeed with Kafka Connect.

Plan your pipeline

Before you begin implementation, determine the sources and destinations of your data and ensure your pipeline is scalable and can handle increasing data volumes. Also provision the infrastructure properly: Kafka Connect requires a scalable and reliable infrastructure for optimal performance. Use a cluster-based architecture with sufficient resources to handle the pipeline load, and ensure the infrastructure is highly available.

Use a Schema Registry

Using a Schema Registry with Kafka Connect allows centralized storage and management of schemas. This helps maintain schema consistency across systems, reduces the likelihood of data corruption, and simplifies schema evolution.
Configure for performance

It is important to design and configure Kafka Connect for future scale and performance. For example, implement a suitable partitioning strategy to distribute data evenly across partitions; this helps your Kafka Connect cluster handle high volumes of data effectively. Likewise, use an efficient serialization format like Avro or Protobuf to transfer data faster over the network.

Monitoring

Monitoring is essential to ensure the smooth operation of your data pipeline. Use monitoring tools to track the performance of your Kafka Connect deployment and quickly identify and resolve any issues. Kafka Connect monitoring options include:

- Prometheus + Grafana
- Confluent Control Center
- Lenses.io
- Elastic Stack (ELK)
- Kafka Connect metrics API

Dead Letter Topic (DLT)

You can also implement a Dead Letter Topic (DLT) as a safety net to catch and handle messages that repeatedly fail during retries, ensuring they are not lost, and on top of that build a reprocessing strategy for the DLT messages.

Debezium

Debezium is an open-source CDC platform built on top of Kafka Connect. It reads directly from database transaction logs, capturing changes in real time and publishing these events to Kafka topics. This enables the integration of this data with multiple consumers and facilitates the development of real-time data pipelines.

Architecture

Debezium runs as a Kafka Connect connector and uses a specific connector for each database type. When configured, Debezium reads transaction logs and streams changes to Kafka topics, where consumers can access these events. This design makes Debezium highly scalable and easy to integrate with other applications and systems.

Benefits of Debezium:

- Open-source and flexible: free and extensible.
- Scalable with Kafka: ideal for distributed, high-demand systems.
- Consistent and reliable: captures changes without modifying the data structure.
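A consumer of these Kafka topics receives Debezium's change-event envelope, whose documented core fields are before, after, and op. The sketch below parses an invented sample record — the field values are made up for illustration; only the envelope shape follows Debezium's documentation:

```python
import json

# Hedged sketch: the envelope shape (before/after/op/ts_ms) follows
# Debezium's documented change-event format; the sample record itself
# is invented for this example.
event = json.loads("""
{
  "before": null,
  "after": {"id": 1, "name": "Joao Silva 1", "email": "joao1@email.com"},
  "op": "c",
  "ts_ms": 1700000000000
}
""")

OPS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}

def describe(event: dict) -> str:
    kind = OPS.get(event["op"], "unknown")
    # For deletes, "after" is null and the old row lives in "before".
    row = event["after"] or event["before"]
    return f"{kind} on row id={row['id']}"

print(describe(event))  # → insert on row id=1
```

Consumers typically branch on the op code exactly like this to decide whether to upsert or remove the corresponding row in the target system.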
Debezium Transformations (SMTs)

Single Message Transformations (SMTs) in Debezium are transformations applied to individual messages during CDC processing. They run after data is read from the source and before it is written to Kafka or the target system.

Transformation Types

Routing transformations modify how messages are routed within the system:

- RegexRouter: modifies the target topic using regular expressions
- ByteBufferConverter: converts fields between different byte formats
- TopicRouter: redirects messages to specific topics based on conditions

Content transformations modify the content of messages:

- ExtractField: extracts a specific field from the message
- InsertField: adds static or dynamic fields
- ReplaceField: replaces values of existing fields
- MaskField: masks sensitive data
- DropFields: removes specific fields

Value transformations change values within messages:

- ValueToKey: moves a value to the message key
- ValueToTimestamp: converts a field to a timestamp
- Cast: converts data types
- TimestampConverter: converts timestamp formats

Structural transformations modify the structure of messages:

- Flatten: converts nested structures to a flat structure
- HoistField: raises nested fields to the top level
- WrapField: encapsulates fields in a new structure

Common Use Cases

1. Sensitive data masking:

{
  "transforms": "maskFields",
  "transforms.maskFields.type": "org.apache.kafka.connect.transforms.MaskField$Value",
  "transforms.maskFields.fields": "creditCard,ssn",
  "transforms.maskFields.replacement": "****"
}

2. Field renaming:

{
  "transforms": "RenameField",
  "transforms.RenameField.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
  "transforms.RenameField.renames": "oldName:newName"
}

3. Metadata addition:

{
  "transforms": "insertSourceDetails",
  "transforms.insertSourceDetails.type": "org.apache.kafka.connect.transforms.InsertField$Value",
  "transforms.insertSourceDetails.static.field": "source_database",
  "transforms.insertSourceDetails.static.value": "production_db"
}

Performance Considerations

- Execution order: transformations are executed in the order they are listed
- Performance impact: each transformation adds processing overhead
- Complexity: complex transformations can affect latency

Best Practices

- Minimize transformations: use only the transformations you need
- Efficient order: place more selective transformations first
- Monitoring: monitor the impact of transformations on performance
- Testing: validate transformations in a development environment
- Documentation: maintain clear documentation of the applied transformations

Limitations

- Some transformations may not preserve original data types
- Complex transformations may affect message-ordering guarantees
- Not all transformations are compatible with each other
- Some transformations may not work with all serialization formats

Usage Recommendations

1. Requirements analysis: clearly identify the transformation needs and assess the impact on the system as a whole.
2. Implementation: start with simple transformations, test each transformation individually, and document all applied transformations.
3. Maintenance: monitor performance regularly, periodically review the need for each transformation, and keep transformations updated with new versions of Debezium.

Use Cases

CDC can be applied in a variety of scenarios to optimize real-time data integration and replication, some of which are listed below.

- Outbox Pattern: monitors an outbox table and forwards events to a broker (e.g., Kafka), ensuring that messages are published even in the presence of failures and decoupling business logic from message delivery.
- Data mirroring (monolith → microservice migration)
- Data replication (integration / microservice → microservice)
- Data auditing
- Real-time analytics (real-time dashboards and reports)
- Data caching (caching certain Postgres data in Redis, for example)
- Data for query (making data available in a search engine like Elasticsearch)
- CQRS (making Command data available in Query data sources)
- etc.

Practical Use of Debezium

Infrastructure Preparation

Ensure that Docker and Docker Compose are installed and running, as we will need to bring up Kafka, Zookeeper, the Debezium connector, and the database.

Creating the Docker Compose file

Create a docker-compose.yml file with the following content:

version: '3.8'
services:
  postgres:
    image: debezium/postgres:16
    ports:
      - "5433:5432"
    environment:
      - POSTGRES_DB=inventory
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    command: postgres -c wal_level=logical
    networks:
      debezium-network:
        aliases:
          - postgres
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    networks:
      debezium-network:
        aliases:
          - zookeeper
  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9093,EXTERNAL://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:9093,EXTERNAL://0.0.0.0:9092
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    networks:
      debezium-network:
        aliases:
          - kafka
  kafka-connect:
    image: debezium/connect:2.5
    ports:
      - "8083:8083"
    depends_on:
      - kafka
      - postgres
    environment:
      BOOTSTRAP_SERVERS: kafka:9093
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: connect_statuses
      KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE: "false"
      CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE: "false"
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
    networks:
      debezium-network:
        aliases:
          - kafka-connect
networks:
  debezium-network:
    driver: bridge

Running the services

In the folder where docker-compose.yml was created, run (in the terminal):

# depending on how your docker was installed,
# you may need to prefix the command with 'sudo'
docker compose up -d

Debezium Connector Configuration

Send a JSON configuration to the Kafka Connect REST endpoint to configure the Debezium connector:

# 1. First, check if Debezium Connect is running:
curl -X GET http://localhost:8083/connectors

# 2.1. Main command to create the connector (WITHOUT transformation / start with this):
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "inventory-clientes-connector",
    "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "database.hostname": "postgres",
      "database.port": "5433",
      "database.user": "postgres",
      "database.password": "postgres",
      "database.dbname": "inventory",
      "database.server.name": "dbserver1",
      "table.include.list": "public.clientes",
      "topic.prefix": "inventory",
      "schema.include.list": "public",
      "decimal.handling.mode": "precise",
      "plugin.name": "pgoutput"
    }
  }'

# 2.2. Main command to create the connector (WITH transformation):
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "inventory-clientes-connector",
    "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "database.hostname": "postgres",
      "database.port": "5433",
      "database.user": "postgres",
      "database.password": "postgres",
      "database.dbname": "inventory",
      "database.server.name": "dbserver1",
      "table.include.list": "public.clientes",
      "topic.prefix": "inventory",
      "schema.include.list": "public",
      "decimal.handling.mode": "precise",
      "plugin.name": "pgoutput",
      "transforms": "unwrap",
      "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
      "transforms.unwrap.drop.tombstones": "false",
      "transforms.unwrap.delete.handling.mode": "rewrite"
    }
  }'

# Example configuration of a Sink connector for Redis:
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "redis-sink",
    "config": {
      "connector.class": "com.github.jcustenborder.kafka.connect.redis.RedisSinkConnector",
      "redis.hosts": "redis:6379",
      "topics": "inventory.public.clientes",
      "tasks.max": "1",
      "key.converter": "org.apache.kafka.connect.json.JsonConverter",
      "value.converter": "org.apache.kafka.connect.json.JsonConverter",
      "key.converter.schemas.enable": "true",
      "value.converter.schemas.enable": "true"
    }
  }'

# 3. To check the connector status:
curl -X GET http://localhost:8083/connectors/inventory-clientes-connector/status

Parameter explanation:

- name: name of the connector, used for management (can be customized).
- connector.class: specifies the Debezium connector for PostgreSQL (io.debezium.connector.postgresql.PostgresConnector).
- tasks.max: maximum number of parallel tasks for the connector (may be increased depending on load).
- database.hostname: PostgreSQL server address.
- database.port: port on which PostgreSQL is running (the default is 5432).
- database.user and database.password: username and password to access the database (the user needs permission to read the transaction logs).
- database.dbname: name of the database to be monitored.
- database.server.name: logical name of the server; used as a prefix for the Kafka topics created for the tables.
- slot.name: name of the replication slot, through which logical replication streams data changes in real time.
- plugin.name: defines the logical replication plugin. For PostgreSQL 10+ with Debezium, pgoutput is recommended.
- table.include.list: list of tables to monitor (in schema.table format). To monitor multiple tables, separate the names with commas.
- publication.name: name of the logical replication publication that captures changes to the specified tables.
- database.history.kafka.bootstrap.servers: Kafka address where the schema history will be stored, essential for Debezium to understand the table structure and schema changes.
- database.history.kafka.topic: Kafka topic where the schema history will be stored.

Monitoring Changes

kafka-console-consumer is a Kafka stack tool for consuming topics directly in the console.
# Console consumer - check messages in the topic
sudo docker exec -it cdc-kafka kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic inventory.public.clientes \
  --from-beginning

Other options are the Big Data Tools plugin for IntelliJ, or Provectus Kafka UI (GitHub - provectus/kafka-ui: Open-Source Web UI for Apache Kafka Management), which can be added to the compose file:

kafka-ui:
  image: provectuslabs/kafka-ui:latest
  container_name: cdc-kafka-ui
  ports:
    - 8080:8080
  depends_on:
    - kafka
  environment:
    DYNAMIC_CONFIG_ENABLED: 'true'
  networks:
    debezium-network:
      aliases:
        - kafka-ui

"Seeing the magic happen"

Create the table:

-- Create the table
CREATE TABLE inventory.public.clientes (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    email VARCHAR(50),
    data_criacao TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Check if there is data
SELECT * FROM inventory.public.clientes;

Connecting the consumer

You can connect the default Kafka consumer to consume messages as soon as they enter the topic (using the kafka-console-consumer command shown above), or use one of the tools listed above (IntelliJ plugin or Kafka UI).
Testing how it works

Inserting data:

INSERT INTO inventory.public.clientes (name, email) VALUES ('João Silva 1', 'joao1@email.com');
INSERT INTO inventory.public.clientes (name, email) VALUES ('João Silva 2', 'joao2@email.com');
INSERT INTO inventory.public.clientes (name, email) VALUES ('João Silva 3', 'joao3@email.com');

Updating data:

UPDATE inventory.public.clientes SET name = 'Pedro Silva', email = 'pedro.silva@email.com' WHERE id = 1;
UPDATE inventory.public.clientes SET name = 'José Silva', email = 'jose.silva@email.com' WHERE id = 2;
UPDATE inventory.public.clientes SET name = 'Jacó Silva', email = 'jaco.silva@email.com' WHERE id = 3;

Deleting data:

DELETE FROM inventory.public.clientes WHERE id = 1;
DELETE FROM inventory.public.clientes WHERE id = 2;
DELETE FROM inventory.public.clientes WHERE id = 3;

Changing the amount of replication information:

-- Changes the amount of information available for logical decoding
-- in the case of UPDATE and DELETE events.
ALTER TABLE inventory.public.clientes REPLICA IDENTITY FULL;

Rerun the tests (insert, update, and delete) after adjusting the replication level.

Troubleshooting

# Viewing the logs
sudo docker compose --file debezium-docker-compose.yml logs kafka
sudo docker compose --file debezium-docker-compose.yml logs zookeeper
sudo docker compose --file debezium-docker-compose.yml logs postgres
sudo docker compose --file debezium-docker-compose.yml logs kafka-connect

# Inspecting the network
sudo docker network inspect debezium-network

Conclusion

Change Data Capture is a technique widely used in the market, in its various forms and tools. It is especially useful in data and system migrations, facilitating the strangler-pattern decomposition of a monolith or the replication of data from one microservice to another, or simply serving as the backbone of the Outbox pattern. In this scenario, Debezium stands out as a great tool, built on Kafka's tried-and-tested Connect platform.
As with any approach, however, it comes with caveats, risks, and trade-offs. It is essential that technical teams understand both the potential and the limitations of the tool, ensuring well-informed architectural decisions.

Habemus Confluence!
Tech Writers September 30, 2025


Projuris Empresas using Confluence.

Everyone knows that documentation is an important process in product management. It preserves legacy, records decisions, and serves as a comprehensive reference so that, in the future, everyone understands what was done and why. It also acts as a communicator and educator, giving users all the information they need about the product. Okay, but how can we achieve this goal and deliver everything users need without making them "hunt for information"? The answer is quite simple: by setting up a Help Center.

Where to start? Choose the right tool. What matters here is that the tool be dedicated to documentation and designed with that purpose in mind; it makes all the difference. Previously, here at Projuris Empresas, we used Movidesk, a support tool that offered a small knowledge-base feature. It hindered a more intuitive organization of content that would make sense to the user, and its limited usability restricted the resources available during consultation. We understood what needed improvement and looked for tools that met this expectation and put documentation at the center of their functionality. That's how we arrived at Confluence (already used and praised by our colleagues at Projuris ADV): its very purpose is to be a documentation tool, and it is one of the references in this field (Habemus Confluence!).

Design your structure. Now that the tool is defined, it's time to think about the structure: how to present the content in a way that makes sense to the user and facilitates access to information. We solved this by literally drawing it out (very artistic, we are): we sat down and thought about all the pre-existing content and how to present it. The main idea was that the structure should reflect our product and be intuitive. In this way, we created large categories that served as groupings for articles and other content.
This also helped at migration time: since we already had the "skeleton", we only needed to move each piece of content to its correct category.

Start the migration. The most anticipated moment arrived: migration. Once the structure was defined, we started the migration process, transferring exactly 618 articles (my friends, it was hard work!). Unfortunately, as Movidesk did not provide a backup in a format compatible with Confluence, we had to resort to good old copy and paste. And here is an important point: if the chosen tool had not been designed for documentation, this process would have been even slower. Because Confluence is built for this, pasted content adapted automatically, requiring little intervention. Despite all this work, we completed the entire migration in less than two weeks. Oh, and an important consideration: today, if we need to export all our content, it can be done in just a few minutes, because Confluence allows exporting to .pdf, .docx, .html, and .xml formats if necessary (a new migration today would be a cinch!).

Align the details. After the migration, it was time to take care of the details, that is, to look at the structure of the content itself. We validated links, images, and gifs, and defined titles, patterns, and so on. At this stage, it's important to put the user at the center of your decisions, as the content must align with the intended structure: make sure the information presented in the articles is logical and clear, that the articles are structured correctly, and, most importantly, that they are searchable and easy to find. These details make a difference in the delivery as a whole. Here, we relied on AI to help us review and create essential articles, especially those aimed at user onboarding.
The only significant problem we had in this process involved the images and gifs used to illustrate our documentation. This type of file cannot be copied and pasted like the rest of the content; each one must be added individually (yes, one by one). Otherwise, a link is created between the two tools, and after a while the images break because they lose their hosting references. So it's good to be careful and attentive in this regard, especially when migrating a considerable amount of content (as was our case).

Think about design. Presentation is key! Once the entire "body" of our blueprint was built, we began to think about the design itself. To help with this, we hired Refined, a plugin that allows you to customize Confluence. So it was our Product Designer's turn to get involved. Once again, we drew the structure we wanted (I mentioned that they are artists), and this served to visually validate whether it would make sense and to guide the specialist. From this drawing, he began to develop the design of the center, considering the product's visual identity and best usability practices. In this step, we again thought about the correlation between product and hub: we brought many references from the system into it, not only the color palette and logos, but also icons, structure, and nomenclature, all so that the user would feel familiar with the context and establish this relationship between the two (later you will understand why this made sense). Furthermore, we did not limit ourselves to Confluence content; we also linked to other platforms with Projuris information, such as YouTube, the Blog, the Website, the Ticket Portal, etc. This gives the user access to everything in this universe in a single place.

Validate before launching. So, we have the structure, the content, and the design. Now what? Can we put it on the air? We could... but how can we be sure it meets our users' needs?
That there is no room for improvement? Only by testing, right? So, before publishing the new hub in the product, we decided to invite internal and external customers for a validation test. We defined a small execution script (search, navigation, naming, structure, etc.) and asked people to provide feedback to further refine our delivery. The result? In addition to feedback that confirmed the project's direction (that "ok, we made the right decision"), we were able to correct course before final delivery: changing nomenclature that wasn't clear, reordering content in the structure, and creating new content based on suggestions. Remember I mentioned the intention of creating a relationship between the product and the hub? This was confirmed here: we could see that users found their way around much more easily because they related their contracted modules to the presentation available in the center, demonstrating familiarity.

Launch and promote widely! Now we can say that everything is "rounded out" (at least to start with), and after all the validations, it's finally time to publish. We started by asking the engineering team to insert the link to the new center directly into Projuris. This was already our practice, so we only needed to replace the old access point. At the same time, we aligned the communication strategy with the marketing team. When everything was ready, we announced it on internal channels (Teams and email), in the product (through Beamer), in the Heroes community (a community made up of Projuris customers), and via email marketing (phew, we made some noise!).

Next steps. As I said, this is just the beginning. Documentation is a cultural endeavor, and people need to be educated to understand both the importance of producing it and the habit of consuming it.
So, from now on, our next steps include curating the new hub, collecting data to analyze opportunities for improvement, integrating with AI initiatives, and producing new content. But the question remains: are you ready to upgrade your help center? If you want to see up close how we did this in Projuris Empresas, click here to explore our new Help Center and let us know what you think. I'd love to share my thoughts on the challenges (and triumphs!) of this process. Send me a message, and let's build even more amazing documentation together!

What do the failure of the Cruzado Plan and mistakes in Product Management have in common?
Tech Writers September 24, 2025


In 1986, Brazil bet everything on an ambitious plan to curb inflation: a new currency (the cruzado), a price freeze, and a bold promise of economic stabilization. The euphoria was immediate. For a brief period, it seemed the country had finally overcome hyperinflation, but the optimism was short-lived. Within a few months, inflation returned with a vengeance—and the plan collapsed. But why did the Cruzado Plan fail? Simply put, the Brazilian government failed to make a clear and precise diagnosis of inflationary dynamics. Francisco Lopes, one of the plan's creators, later acknowledged that the government underestimated the complexity of inertial inflation and overestimated the power of a price freeze without transition mechanisms (LOPES, 1989). There were also internal political disputes between ministries that hindered the construction of a more sustainable plan. Luiz Carlos Bresser-Pereira, an active member of the government at the time, was one of the first to point out that the plan lacked fiscal support and more flexible adjustment mechanisms. The result? A fleeting relief, followed by an even more disorganized return of inflation, culminating in the plan's failure in less than a year. And what does this have to do with Product Management? More than it seems. How often, in Product Management, do we feel pressured to deliver a solution quickly? How many times do we skip the diagnostic phase? How many times do we jump straight into development, without fully understanding the user's pain point, the usage scenario, or even whether the problem exists at all? Driven by urgency, we deliver quickly—and poorly. And then what we launch doesn't solve the real pain point, doesn't create value, and can even create more confusion than solution. Take, for example, the new Title Query Screen we created for Sienge. We believed that what users valued most was a comprehensive screen, full of filters and with maximum information available.
Although we validated some of these hypotheses—and some were even confirmed—at launch we realized that the reality was different: engagement was lower than expected, and during the pilot, 60% of customers still preferred the old screen. Faced with this scenario, we began active listening: we collected feedback, analyzed the main friction points, and delved deeper into user behavior. We discovered that the main pain point wasn't a lack of information, but rather an excess of it. For some customers, the new screen seemed too cluttered and made navigation difficult. On the other hand, there was also a group that valued broad, detailed access to data. In other words, what seemed like a functionality issue was actually a flexibility issue. The solution only emerged from this deeper understanding: we created two complementary versions—a synthetic screen, lighter and more objective, and an analytical one, with more detailed breakdowns of the results. This way, we can serve different user profiles, respecting their ways of working. The synthetic version will undergo pilot testing in the coming months. Let's look at the differences between the proposed solutions: the synthetic version and the analytical version.

What does the theory say? This reality has been widely discussed in Product best practices, such as those advocated by Marty Cagan in "Inspired" (2017), which emphasizes that the best products are born from a deep understanding of users' problems—not from rushing to deliver something. Melissa Perri likewise highlights in "Escaping the Build Trap" (2018) that teams that fail to diagnose and prioritize real problems fall into the trap of "delivering for the sake of delivering," with no impact. Diagnosis matters—a lot. Understanding the context, investigating the causes, truly listening to the pain points: this makes all the difference. In the Cruzado Plan, the lack of a solid diagnosis compromised the entire strategy.
In Product Management, the same thing happens when we skip discovery and jump straight to delivery.

References

Bresser-Pereira, Luiz Carlos. The Crisis of the State. Nobel, 1992.
Cagan, Marty. Inspired: How to Create Tech Products Customers Love. Wiley, 2017.
Lopes, Francisco. The heterodox shock of the Cruzado Plan and the theory of inertial inflation. Brazilian Journal of Political Economy, 1989.
Perri, Melissa. Escaping the Build Trap: How Effective Product Management Creates Real Value. O'Reilly Media, 2018.

Intelligent Optimization: How We Use AI to Transform Processes into Productivity
Tech Writers September 10, 2025


This article was prepared with the support of Artificial Intelligence tools, used to structure and organize information, with human review and supervision at all stages.

In the corporate world, time is the scarcest resource and, often, the most poorly used. Thousands of hours are wasted on repetitive tasks, manual processes, and demands of low strategic value. Why is this important? The impact of inefficiency goes far beyond the financial: it compromises productivity, reduces engagement, and weakens the strategic alignment of teams, directly affecting the capacity for innovation and organizational competitiveness. It is in this context that Artificial Intelligence (AI) ceases to be "just another" technological innovation and begins to act as a strategic pillar of organizational transformation, driving the redesign of processes and the creation of new work models. We're not talking only about automation, but about changing the very role of people within organizations, creating space for human talent to be invested in creativity, strategy, and innovation.

The Numbers

Manual processes that don't require deep or strategic analysis can be real productivity saboteurs. AI-powered automation gives you back valuable hours and, more importantly, gives you back focus, purpose, and autonomy. A study carried out by BCG X (2024) showed that professionals who adopted AI gained, on average, 5 hours per week to spend on higher-value activities. From a business perspective, this means entire weeks of extra productivity per year. Among those interviewed:

41% reported greater efficiency in deliveries.
39% expanded their responsibilities to new tasks.
38% began to act more strategically.

The OECD (2025) reinforces that companies that apply AI to operational tasks gain between 5% and 25% in productivity. The gain is not only in time saved, but in the repositioning of human labor.
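As a back-of-the-envelope check on "entire weeks of extra productivity per year": assuming roughly 48 working weeks and a 40-hour week (my assumptions, not figures from the BCG X study), 5 hours per week adds up as follows:

```python
# Sanity-checking the annual impact of 5 saved hours per week.
# 48 working weeks/year and a 40-hour week are assumptions, not study data.
hours_saved_per_week = 5
working_weeks_per_year = 48

annual_hours = hours_saved_per_week * working_weeks_per_year
print(annual_hours)       # 240
print(annual_hours / 40)  # 6.0 -> about six full 40-hour weeks per year
```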
And HP, which conducted a global study interviewing more than 15 thousand people in 12 countries, found that 73% of respondents say AI makes work easier, and 69% personalize their use of AI to increase productivity.

Overview: the world has already changed. And so has work.

The benefits of Artificial Intelligence, when used well and strategically integrated into our work routine, are clear, and they indicate that we will need to change the way we are used to working. According to the global consultancy McKinsey, more than 50% of occupations have at least 30% of their activities that can be automated with technologies already available. By 2030, half of all work activities could be performed entirely or partially by AI. While this data may seem worrying, the most productive interpretation is optimistic: AI will not steal jobs, but rather take on the execution of repetitive and routine tasks, freeing up space for professionals to develop more human and creative skills. The question is no longer "if" you will adopt AI, but how you will use it responsibly, strategically, and measurably.

Where AI is strong and humans are irreplaceable

Kai-Fu Lee, a computer scientist and AI expert, says that Artificial Intelligence has come to free us from routine work and remind us of what makes us human. In his book Artificial Intelligence (2019), he recounts the defeat of Ke Jie, the best human Go player, by AlphaGo (an AI) as a symbol of the potential of human-AI interaction: the human brings purpose, feeling, and creativity to everything he does; AI executes and follows commands accurately.
Based on this, Lee created a matrix that helps us better understand where AI can enhance processes and where human presence remains essential:

[Matrix figure: one axis contrasts "Needs social interaction/empathy" with "No social interaction or empathy". Source: Kai-Fu Lee]

The matrix considers two dimensions:

Cognitive complexity and creativity: tasks that require judgment, empathy, strategic decision-making, or innovation.
Routine and data: repetitive, predictable, or data-driven tasks.

Based on these dimensions, processes can be classified into three categories:

AI alone: activities that are standardized, predictable, or based on rules and data. These are processes that can be automated without loss of quality or risk.
Human alone (with AI as support): activities that require contextual understanding, creativity, empathy, or strategic decision-making. These are processes that cannot be delegated exclusively to AI without compromising the result.
Human + AI: processes where AI performs the repetitive or analytical tasks, freeing the human to focus on what requires judgment, strategic vision, or creativity. This combination generates more efficient results and allows humans to act where they add the most value.

This way, any area can analyze its processes, thinking about what can be safely automated and what requires human intelligence and sensitivity. The matrix serves as a guide for prioritizing where to invest in AI and where to keep humans at the center of the process. According to Kai-Fu Lee, there are three areas in which AI falls short of human capabilities and is unlikely to catch up in the coming decades:

Creativity – AI does not create, conceptualize, or plan strategically. It only optimizes toward defined objectives, without the ability to set its own goals, transfer knowledge across domains, or use common sense.
Empathy – AI cannot feel or interact based on emotions.
It does not convey genuine care, understanding, or motivation, and is unlikely to replace the human touch in services that depend on that connection.
Physical dexterity – AI cannot perform complex physical work that requires fine motor coordination, or handle unfamiliar and unstructured spaces, especially ones it has not observed.

The future of work is not about humans or machines, but about the synergy between the two: a collaboration in which AI enhances what we do best, while we give direction, purpose, and creativity to what it does.

The Starian and Softplan case

Softplan identified a strategic opportunity to optimize two essential, yet operational and time-consuming, activities: contract analysis and responding to privacy questionnaires.

Contracts: need to be drafted and reviewed to ensure adherence to internal policies and legislation.
Privacy questionnaires: require substantiated responses based on regulations and internal documents; they are extensive and crucial for transparency and consistency.

Despite its importance, the process was manual, followed internal standards, and required many hours, which increased the risk of errors as demand grew. The team saw AI as a chance to transform this scenario.

From idea to implementation

With IT support, the solution was created in a structured and secure way, operating in a closed environment that guarantees greater data protection and privacy. Development followed five steps:

1. Prioritize the activities with the greatest impact and volume.
2. Define AI boundaries, keeping critical decisions under human oversight.
3. Train the model with contracts, internal policies, and relevant references.
4. Configure standards and commands for the organizational context.
5. Integrate the final delivery, with comments and suggestions, for review by the privacy team.

The result: a tool that analyzes contracts and questionnaires, delivers the document ready for review, and maintains consistency and quality.
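The productivity figures reported next can be sanity-checked with simple arithmetic; note that the number of questionnaires per month is implied by the quoted totals, not stated explicitly:

```python
# Contracts: 70 received per month, 30 minutes saved on each analysis.
contracts_per_month = 70
minutes_saved_per_contract = 30
contract_hours_per_month = contracts_per_month * minutes_saved_per_contract / 60
print(contract_hours_per_month)       # 35.0 hours/month
print(contract_hours_per_month * 12)  # 420.0 hours/year

# Questionnaires: 2h30 (2.5 h) saved each, 50 h/month quoted in the case.
hours_saved_per_questionnaire = 2.5
questionnaire_hours_per_month = 50
# Implied volume, not stated in the text:
print(questionnaire_hours_per_month / hours_saved_per_questionnaire)  # 20.0 per month
print(questionnaire_hours_per_month * 12)  # 600 hours/year
```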
Impact on productivity The gains were tangible: Contracts: Considering an average of 70 contracts received per month, each one took an average of 2 hours to analyze. With AI, we save 30 minutes per contract, equivalent to 35 hours/month or 420 hours/year. Privacy questionnaires: There was a 50% reduction in time, saving 2h30 per questionnaire, equivalent to 50h/month or 600h/year. This automation freed the team to make strategic decisions without compromising the quality or reliability of the processes. Learnings and Best Practices Throughout the implementation, the Starian Privacy team realized that the success of AI depends on the balance between technology and human oversight. Some points stood out: Personalization is key: AI needs to be trained and adjusted to the specific business context, demands, and activities. Every detail counts. Continuous supervision is essential: hallucinations and errors can occur, and human intervention ensures accuracy and adequacy. Data quality matters: the better the inputs, the better and more reliable the results generated. Constant interaction generates improvements: real feedback from the team improves the tool and creates continuous evolution. Critical decisions remain human: AI is an ally, but strategic analysis remains an essentially human condition. Contracted solutions increase security: validated tools ensure consistency and protection. These lessons show that agility and caution need to go hand in hand. The inappropriate use of AI can generate serious risks: financial losses, leakage or improper retention of personal, sensitive, and confidential data, creation of false information, and excessive reliance on insecure systems. Therefore, some practices are essential: Avoid entering personal data or confidential documents, especially in non-contracted tools. Always review the generated content. Question biases or distortions in the results. Use only solutions approved and suitable by the organization. 
Train the team, and record the risks of each use together with the privacy and information security areas.

At Starian, we've learned that when humans and AI work in synergy, the benefits go far beyond productivity: we deliver greater quality, safety, and time to focus on what really matters, the strategic decisions that only humans can make. Technology enhances, but it is the human eye that ensures consistency and trust.

Conclusion: the experience of Starian and Softplan

No-code: 5 lessons learned from a real project
Tech Writers August 26, 2025


Recently, I led the Product and UX teams in developing an AI SaaS solution for SEO, with a strategic focus on adopting technologies that accelerate the pursuit of Product-Market Fit (PMF) and a Minimum Viable Product (MVP). The chosen approach also aimed to optimize crucial factors, such as reducing development costs and shortening the learning curve, ensuring a more efficient and accessible process. We therefore decided to use the following stack for software development: Figma (prototyping); a no-code solution for the web front-end; and, for the backend, initially a low-code platform, though we pivoted to high-code technology. One of the main advantages of this stack is that front-end development becomes much faster thanks to the built-in facilitators, while the core of the solution stays in a high-code backend, allowing scale with greater cloud efficiency and flexibility. I can explain the rationale and the options we considered in a future post (comment here if you'd like to read about it).

5 lessons learned from no-code:

1. Greater speed: It is indisputable that a no-code solution brings more agility to development. I believe we reduced front-end development time by around 80%. It's really nice to see how easy it is to create interfaces and workflows, especially when you can simply use a plugin and import screens directly from Figma. If you're in UX or a PM, imagine validating deliveries and making adjustments quickly. Even better: you can create the interfaces yourself! I did this several times throughout the project. We often spend time documenting small usability tweaks for developers to fix, but with this approach you can simply select the layer and tweak it directly, without intermediaries. One positive point worth highlighting: the no-code software we use has an auto-layout structure very similar to Figma's, so if you master Figma, you will adapt easily. 2.
There will be a learning curve: There are few professionals in the country with practical experience in low/no-code technology. This means you will probably have to train the team on it, and naturally this will generate some operational errors and bugs, and will fry a few neurons. The learning curve is much gentler than with high-code technology, but it will still take time to build a mature team.

3. You'll have to rethink some processes: team management, branches, merges, staging servers, deployments... You'll have to rethink these processes to ensure quality deliveries and mitigate errors. Imagine someone misclicking in the no-code software and pushing the wrong branch to production...

4. New problems arise: Not everything is perfect. When using a platform to develop software, you may become dependent on the company that provides it (the famous vendor lock-in), so it is important to choose your technology consciously. The software we chose can export the project as code, so if you ever want to leave the platform, it is possible, although I have never done it. Have you ever exported a project in production? Tell us how it went. Additionally, you start to encounter different problems: for example, there was a screen with a behavior that a simple "for" loop would have resolved in code, but in no-code it was a challenge...

[Screenshot of a call where we were trying to solve this problem]

5. There will be paradigm shifts: As with any change, some people may feel threatened, afraid, or resistant. You need to find people willing to relearn and explore new things. We didn't face any resistance in this project, but I've seen people in other environments who still believe these solutions don't scale and don't work.
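To make point 4 concrete: the kind of behavior mentioned above, trivially handled by a loop in code, can be surprisingly awkward to express in a visual builder. A hypothetical sketch (the names here are illustrative, not from our project):

```python
# Hypothetical example: apply the same rule to every section of a page.
# In code this is a one-line loop/comprehension; in some no-code editors the
# equivalent requires duplicating the configuration per element.
sections = ["header", "pricing", "faq", "footer"]
enabled_for_plan = {"header", "faq"}  # illustrative plan configuration

visible = [s for s in sections if s in enabled_for_plan]
print(visible)  # ['header', 'faq']
```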
Conclusion The use of low-code/no-code tools, such as WeWeb, has significantly helped us accelerate experimentation and software development. However, it's not all sunshine and roses: along the way, we had to rethink several DevOps processes as specific nuances of maintaining low-code software in production emerged. The real challenge for any business is to deeply understand customer needs and find effective ways to increase engagement, perceived value, acquisition, and retention. Furthermore, every software product carries an often-overlooked cost: the opportunity cost. This cost represents the time and resources we decide to invest in a given initiative (time that could be directed towards other, potentially more valuable, investments). By accelerating development with low-code/no-code, we've been able to substantially reduce this cost, allowing the team to focus on what really matters: delivering value to the customer, not just writing code.